CN114007037B - Video front-end intelligent monitoring system and method, computer equipment and terminal - Google Patents

Video front-end intelligent monitoring system and method, computer equipment and terminal

Info

Publication number
CN114007037B
CN114007037B · CN202111109933.0A
Authority
CN
China
Prior art keywords
zu9eg
video
hi3559a
coprocessor
chip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111109933.0A
Other languages
Chinese (zh)
Other versions
CN114007037A (en)
Inventor
颜露新
黎瑞
钟胜
曹旭航
蔡智
王健
龚恩
谭富中
朱太云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Hanning Information Technology Co ltd
Huazhong University of Science and Technology
Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd
Original Assignee
Nanjing Hanning Information Technology Co ltd
Huazhong University of Science and Technology
Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Hanning Information Technology Co ltd, Huazhong University of Science and Technology, Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd filed Critical Nanjing Hanning Information Technology Co ltd
Priority to CN202111109933.0A priority Critical patent/CN114007037B/en
Publication of CN114007037A publication Critical patent/CN114007037A/en
Application granted granted Critical
Publication of CN114007037B publication Critical patent/CN114007037B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/17Interprocessor communication using an input/output type connection, e.g. channel, I/O port
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/60Memory management
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/76Television signal recording
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/508Monitor
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Neurology (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention belongs to the technical field of video monitoring and discloses a video front-end intelligent monitoring system and method, computer equipment and a terminal. A main processor implements the system image data interfaces, the external communication interfaces, image preprocessing and system management; a coprocessor is responsible for the inference computation of the convolutional neural network algorithm and returns the computation result to the main processor through the in-board communication interface; a power subsystem supplies power to the whole system; a storage subsystem comprises a non-volatile memory, an on-chip cache and an off-chip dynamic memory; a clock subsystem uses active crystal oscillators to provide the clock sources required by each subsystem; and a rich set of data communication interfaces provides the input/output of system video data, the input/output of control instructions, and the in-board data interconnection between the main processor and the coprocessor. By decomposing the computation task and distributing it to appropriate hardware execution units, the invention achieves heterogeneous acceleration, increases the algorithm computation speed and fully exploits the computing performance of the system.

Description

Video front-end intelligent monitoring system and method, computer equipment and terminal
Technical Field
The invention belongs to the technical field of video monitoring, and particularly relates to a video front-end intelligent monitoring system and method, computer equipment and a terminal.
Background
At present, video monitoring is widely applied across industries: management departments acquire image data through front-end cameras so that sudden abnormal events can be monitored and recorded in time, providing efficient command and a record of abnormal situations. The rapid development of the video monitoring market has brought an explosive increase in data. Traditional video monitoring is based on raw video only and has no effective, regular, classified storage; faced with massive video data, data analysts can only search the video content of an incident by time, which consumes a great deal of time and seriously delays the response to the incident, and manual monitoring easily misses events because human labor is limited. The traditional video monitoring system can no longer meet the application requirements of all industries, so the development of video monitoring toward intelligence must be vigorously promoted.
The video front-end intelligent monitoring system can be embedded into the front-end camera to give the camera intelligent computing capability. For the data captured by massive numbers of monitoring cameras, it can use video image processing to automatically detect, track, recognize, understand and analyze the behavior of targets and extract useful information, and it can raise an alarm in time to prompt a rapid response. For event detection and recognition, only the effective images are transmitted and stored, which saves a large amount of transmission bandwidth and relieves the pressure on back-end servers and storage. The video front-end intelligent monitoring system should be designed to meet the following requirements: (1) the hardware must be small, light and low-power; (2) it must provide strong computing power and support rapid deployment of complex deep-learning intelligent algorithms; (3) it must provide high-speed video transmission interfaces to meet the requirements of multi-channel ultra-high-definition video input and output. Most existing video front-end intelligent monitoring systems can only deploy simple intelligent algorithms or only support video input over a single interface, and cannot satisfy all of these application requirements at the same time. Therefore, a new video front-end intelligent monitoring system is needed to make up for the deficiencies of the prior art.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) Traditional video monitoring is based on raw video only and has no effective, regular, classified storage; for massive video data, a data analyst can only search the video content of an incident by time, which consumes a great deal of time and seriously delays the response to the incident.
(2) Existing manual monitoring is limited by human labor and easily misses events, so the video monitoring system in the traditional sense cannot meet the application requirements of all industries.
(3) Most existing video front-end intelligent monitoring systems only support video input over a single interface, have limited computing power and only support simple intelligent algorithm deployment, and cannot meet the requirements of small hardware size, light weight and low power consumption.
The difficulty in solving the above problems and defects is:
In order to solve the above problems, a new video front-end intelligent monitoring system needs to be designed, covering both the hardware devices and a software flow in which software and hardware cooperate. In main chip selection, power system design, storage system design, clock system design and data communication link design, the requirements of small size, light weight, low power consumption, strong computing power and fast communication must be fully considered. On the software side, an efficient software-hardware cooperative flow needs to be designed that decomposes the computation task and distributes it to appropriate hardware execution units, so that the hardware computing performance is fully exploited and heterogeneous acceleration is achieved.
The significance of solving the problems and the defects is as follows:
the novel video front-end intelligent monitoring system has small volume, light weight and low power consumption, and can be simply and conveniently embedded into the existing monitoring equipment, so that the front-end video monitoring intelligence can be quickly realized. The method has strong deep learning computing capability, can support the deployment of various deep learning intelligent visual algorithms, can realize automatic detection, tracking, recognition, understanding and behavior analysis of targets in the video, extracts effective video data for storage, and greatly reduces the storage pressure of a back-end server. The automatic monitoring replaces manual monitoring, so that the labor cost is reduced, and the monitoring efficiency is improved.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a video front-end intelligent monitoring system, a video front-end intelligent monitoring method, computer equipment and a terminal.
The invention is realized in this way: in a video front-end intelligent monitoring system, the hardware board of the system is composed of a Xilinx MPSoC, a HiSilicon SoC and their corresponding power supply circuits, clock and reset circuits, storage and configuration function circuits, and debugging/communication interface circuits;
the hardware board card of the video front-end intelligent monitoring system adopts the 3U VPX standard with dimensions of 10 cm × 16 cm × 8 cm, weighs 220 g, uses a 14-layer PCB design, has a typical operating power consumption of 7.6 W and a single-board peak computing capacity greater than 4 TOPS, and operates over a temperature range of 0 °C to 70 °C;
the video front-end intelligent monitoring system further comprises:
the main processor is used for realizing a system image data interface, an external communication interface, image preprocessing and system management;
the coprocessor is used for being responsible for the inference calculation of the convolutional neural network algorithm and transmitting the calculation result back to the main processor through the in-board communication interface;
the power subsystem is used for supplying power to the whole system;
the storage subsystem is used for storing the model parameters, configuration files and software codes in a non-volatile memory and storing intermediate layer characteristic diagram data generated in the forward reasoning calculation process of the convolutional neural network model in an on-chip cache and an off-chip dynamic memory;
the clock subsystem is used for providing each clock source required by each subsystem by adopting an active crystal oscillator;
and the data communication interface is used for realizing image data input/output and control instruction input/output of different video interfaces and realizing data interconnection of the main processor and the coprocessor through the high-speed data interface and the low-speed instruction interface.
Further, the main processor is a Xilinx ZU9EG, which integrates an ARM CPU, a GPU and FPGA programmable logic resources;
the coprocessor is a Hi3559A audio and video processing chip and integrates a high-performance image signal processor ISP, a multi-core ARM CPU, a GPU and a neural network acceleration engine NNIE.
Further, the power subsystem includes:
the Hi3559A power supply subsystem improves power conversion efficiency through two-stage voltage conversion; the four 0.8 V core supplies, namely DVDD, DVDD_CPU, DVDD_GPU and DVDD_MEDIA, are powered independently, and dynamic voltage scaling is used during operation to reduce system power consumption; after the power management interface enables the MP8759 power chip, the 12 V single-board input is converted into a 5 V supply; a TLV75518 chip provides the 1.8 V supply for the power management control unit (PMC) integrated in the Hi3559A chip; after the PMC starts, it sequentially enables the power management signals PWR_SE[0…];
the ZU9EG power supply subsystem consists of two IRPS5401 chips and one TDA21240 chip; after the power management interface enables the IRPS5401 power chips, the 12 V input is converted into the supplies required by the ZU9EG chip, such as 1.2 V, 1.8 V and 0.9 V; an external independent DC-DC power chip, the TDA21240, generates the 0.85 V core voltage capable of sourcing a large current so as to meet the 20 A peak current requirement of the ZU9EG core supply.
Further, the storage subsystem consists of a non-volatile memory, an on-chip cache and an off-chip dynamic memory. Besides their on-chip caches, the main processor ZU9EG and the coprocessor Hi3559A are externally connected to DDR4 dynamic memory, eMMC and Flash non-volatile memory. For the main processor ZU9EG, the DDR4 dynamic memory uses four MT40A512M16JY-075EAIT chips, the eMMC uses an H26M41208HPR chip, the SPI NOR Flash uses an MX25U25635F chip, and the SPI NAND Flash uses an MX35UF2G14AC-Z4I chip; for the coprocessor Hi3559A, the DDR4 dynamic memory uses four MT40A512M16JY-083EAIT chips, the eMMC uses an MTFC8GACAJCN-4M_IT_TR chip, and the SPI NOR Flash uses an MT25QU512ABB8ESF-0SIT chip.
Further, the ZU9EG clock subsystem comprises four clock sources: the system reference clock PS_REF_CLK, the PCIe reference clock PCIE_CLK, the SRIO reference clock SRIO_CLK and the system real-time clock PS_RTC_CLK; the PS_REF_CLK frequency is 33.333 MHz, the PCIE_CLK frequency is 100 MHz, the SRIO_CLK frequency is 156.25 MHz and the PS_RTC_CLK frequency is 32.768 kHz; the Hi3559A clock subsystem comprises two clock sources: the system reference clock XIN and the system real-time clock RTC_XIN, where the XIN frequency is 24 MHz and the RTC_XIN frequency is 32.768 kHz; in addition to the clocks related to the processing chips, the Ethernet physical-layer transceiver EPHY integrated in the hardware system requires a 25 MHz reference clock.
Further, the data communication interface includes:
ZU9EG_SRIO, a high-speed data interface, 20 Gbit/s, used for system high-definition video input/output;
Hi3559A_HDMI, a high-definition video interface, 5 Gbit/s, used for system high-definition video output;
ZU9EG_PCIE_Hi3559A, a high-speed data interface, 8 Gbit/s, used for in-board high-speed data transmission between the ZU9EG and the HiSilicon Hi3559A;
ZU9EG_LVDS_Hi3559A, a high-speed data interface, 1.9 Gbit/s, used for in-board high-speed data transmission between the ZU9EG and the HiSilicon Hi3559A;
ZU9EG_BT1120_Hi3559A, a high-speed data interface, 148.5 Mbit/s, used for in-board high-speed data transmission between the ZU9EG and the HiSilicon Hi3559A;
ZU9EG_Ethernet, a high-speed data interface, 1 Gbit/s, used for system high-speed data transmission/debugging;
Hi3559A_Ethernet, a high-speed data interface, 1 Gbit/s, used for Hi3559A debugging;
USB/UART/SPI/IO, low-speed data interfaces, used for control instruction output.
another objective of the present invention is to provide an intelligent monitoring method for a video front end, which comprises the following steps:
step one, a video image is input from the high-speed data interface SRIO or Ethernet of the main processor ZU9EG;
step two, the main processor ZU9EG applies an image enhancement method of image quantization/stretching and denoising, realizing image enhancement in parallel through pipelined operations;
step three, for large-format images, the main processor ZU9EG can block the image, cutting it into sub-images before sending them to the coprocessor Hi3559A for processing (a tiling sketch follows this list of steps);
step four, the main processor ZU9EG transmits the processed image to the coprocessor through the high-speed data communication interface LVDS/PCIE/BT1120 of the coprocessor Hi3559A;
step five, the coprocessor Hi3559A receives the image data;
step six, the coprocessor Hi3559A completes image preprocessing and forward inference of the deep neural network model using the convolutional neural network acceleration engine NNIE;
step seven, the coprocessor Hi3559A post-processes the inference result of the deep neural network model using its ARM CPU;
step eight, the coprocessor Hi3559A transmits the result to the main processor ZU9EG using the low-speed data interface SPI/UART of the main processor;
step nine, the main processor ZU9EG packages the result data and outputs the result using an external low-speed data interface SPI/UART, and outputs the processed image using an external high-speed data interface SRIO or Ethernet.
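As a concrete illustration of the image blocking in step three, the following C sketch cuts a single-channel frame into fixed-size, non-overlapping tiles and hands each tile to a caller-supplied transfer routine. The tile size, the single-channel assumption, the omission of partial edge tiles and the send callback are illustrative assumptions only and are not specified by the invention.

```c
#include <stddef.h>
#include <string.h>

#define TILE_W 1024   /* assumed tile width in pixels  */
#define TILE_H 1024   /* assumed tile height in pixels */

/* Copy one TILE_W x TILE_H tile whose top-left corner is (x0, y0) out of a
 * frame that is 'stride' bytes per row into the contiguous buffer 'tile'. */
static void extract_tile(const unsigned char *frame, size_t stride,
                         size_t x0, size_t y0, unsigned char *tile)
{
    for (size_t row = 0; row < TILE_H; ++row)
        memcpy(tile + row * TILE_W, frame + (y0 + row) * stride + x0, TILE_W);
}

/* Walk the frame in tile-sized steps and forward every full tile; the 'send'
 * callback stands in for the LVDS/PCIE/BT1120 transfer to the coprocessor and
 * is not part of the patent text. Partial tiles at the right and bottom edges
 * are skipped here for brevity. */
void block_and_send(const unsigned char *frame, size_t width, size_t height,
                    unsigned char *tile,
                    void (*send)(const unsigned char *tile, size_t len))
{
    for (size_t y = 0; y + TILE_H <= height; y += TILE_H)
        for (size_t x = 0; x + TILE_W <= width; x += TILE_W) {
            extract_tile(frame, width, x, y, tile);
            send(tile, (size_t)TILE_W * TILE_H);
        }
}
```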
Furthermore, a quad-core CPU and a dual-core NNIE are integrated in the coprocessor Hi3559A, and the deep-learning intelligent algorithm computation on the coprocessor involves scheduling and data transmission across different computing resources, including the NNIE and the CPU; a multi-thread parallel application-layer software flow is designed for the coprocessor, comprising a main thread, an image receiving sub-thread, NNIE compute sub-threads and a data communication sub-thread (a sketch of the image cache shared by these threads follows the flow below);
wherein, the software flow of the coprocessor is as follows:
(1) The main thread:
(1.1) initializing a system;
(1.2) initializing a neural network acceleration engine and loading a model;
(1.3) creating an image receiving sub-thread;
(1.4) creating a communication sub-thread;
(1.5) creating an NNIE0/1 processing thread;
(1.6) waiting for the sub-threads to exit;
(1.7) de-initializing the system;
(2) Image data stream thread:
(2.1) initializing a video input module;
(2.2) starting a video input module;
(2.3) initializing a video processing module;
(2.4) starting a video processing module;
(2.5) binding the input data stream;
(2.6) acquiring video frame data;
(2.7) writing to an image cache;
(2.8) repeating the step (2.6);
(3) NNIE0/1 compute thread:
(3.1) reading an image from the image cache;
(3.2) forward reasoning calculation;
(3.3) post-processing;
(3.4) writing the result to the result cache;
(3.5) repeating the step (3.1);
(4) Data communication sub-thread:
(4.1) initializing a communication interface;
(4.2) reading a result from the result cache;
(4.3) data transmission;
(4.4) repeating the step (4.1).
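The hand-off between the image data stream thread (which writes frames to the image cache in step (2.7)) and the NNIE compute threads (which read them in step (3.1)) can be sketched as a small bounded producer/consumer buffer. The buffer depth, the frame descriptor type and the blocking policy below are assumptions made for illustration only and are not prescribed by the invention.

```c
#include <pthread.h>
#include <stddef.h>

#define CACHE_DEPTH 4                 /* assumed number of buffered frames */

typedef struct { unsigned char *data; size_t len; } frame_t;

typedef struct {
    frame_t         slots[CACHE_DEPTH];
    int             head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
} frame_cache_t;

/* Called by the image data stream thread after acquiring a video frame. */
void cache_put(frame_cache_t *c, frame_t f)
{
    pthread_mutex_lock(&c->lock);
    while (c->count == CACHE_DEPTH)            /* wait while the cache is full */
        pthread_cond_wait(&c->not_full, &c->lock);
    c->slots[c->head] = f;
    c->head = (c->head + 1) % CACHE_DEPTH;
    c->count++;
    pthread_cond_signal(&c->not_empty);
    pthread_mutex_unlock(&c->lock);
}

/* Called by an NNIE compute thread before forward inference. */
frame_t cache_get(frame_cache_t *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->count == 0)                       /* wait while the cache is empty */
        pthread_cond_wait(&c->not_empty, &c->lock);
    frame_t f = c->slots[c->tail];
    c->tail = (c->tail + 1) % CACHE_DEPTH;
    c->count--;
    pthread_cond_signal(&c->not_full);
    pthread_mutex_unlock(&c->lock);
    return f;
}
```

The mutex and condition variables are assumed to be initialized with pthread_mutex_init/pthread_cond_init before use; the same pattern would apply to the result cache written in step (3.4) and read in step (4.2).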
Another object of the present invention is to provide a computer device, which includes a memory and a processor, wherein the memory stores a computer program, and the computer program, when executed by the processor, causes the processor to execute the video front-end intelligent monitoring method.
Another object of the present invention is to provide an information data processing terminal, which is used for implementing the functions of the video front-end intelligent monitoring system.
By combining all the technical schemes above, the invention has the following advantages and positive effects: the video front-end intelligent monitoring system provided by the invention designs heterogeneous video front-end intelligent monitoring hardware based on a Xilinx MPSoC ZU9EG FPGA and a HiSilicon Hi3559A SoC, and provides an algorithm software implementation scheme based on software-hardware cooperation. Through the two processing systems, the invention efficiently integrates programmable logic resources, general-purpose processor resources and a dedicated convolutional neural network accelerator, so that the video front-end intelligent monitoring hardware is flexible, high-performance and low-power. The hardware system of the invention has the following advantages:
(1) Through reasonable chip selection, power and storage design, and careful PCB layout and routing, the single board is small (10 cm × 16 cm × 8 cm), light (220 g) and low-power (7.6 W operating power consumption);
(2) A deep-learning AI processor is selected and a heterogeneous system architecture is adopted, so the system has strong AI computing power, with a peak computing capacity greater than 4 TOPS, and supports rapid deployment of complex deep-learning intelligent algorithms;
(3) The system provides various high-speed video data input/output interfaces such as 4x SRIO (26 Gbps transmission bandwidth), Gigabit Ethernet (1 Gbps) and USB (480 Mbps), together with the high-speed video data output interface HDMI, which can simultaneously satisfy the input and output of multi-channel ultra-high-definition video.
The software scheme of the invention adopts a software-hardware co-design: by decomposing the computation task and distributing it to appropriate hardware execution units, the hardware computing performance is fully exploited and heterogeneous acceleration is achieved. A deep-learning intelligent detection and recognition algorithm generally comprises five parts: image input, preprocessing, forward inference of the convolutional neural network model, post-processing and result output. Based on the above hardware system, the invention provides a software-hardware co-designed algorithm software flow that realizes heterogeneous acceleration by decomposing the computation task and distributing it to appropriate hardware execution units.
The hardware of the video front-end intelligent monitoring system has strong computing power: the neural network inference engine integrated in the HiSilicon Hi3559A chip provides a dedicated hardware computing unit for each operator in the forward inference computation of a convolutional neural network model, so the system can support complex deep-learning intelligent detection and recognition algorithms, for example face recognition, pedestrian/vehicle detection, defect detection and smoke detection, thereby making the video front-end monitoring system intelligent.
In the hardware of the video front-end intelligent monitoring system provided by the invention, the board card adopts the 3U VPX standard (10 cm × 16 cm × 8 cm), weighs 220 g, uses a 14-layer PCB design, has a typical operating power consumption of 7.6 W and a single-board peak computing capacity greater than 4 TOPS, and operates over a temperature range of 0 °C to 70 °C. This hardware system overcomes the insufficient computing and storage resources and the single data communication interface of traditional video front-end intelligent monitoring systems, and provides a good hardware basis for video front-end intelligent monitoring.
The front-end intelligent monitoring hardware consists of the ZU9EG, the Hi3559A and their peripheral circuits; to reduce the coupling of the power system and improve system reliability, the power subsystems of the two processing chips are designed separately. The power subsystem design must consider both system efficiency and the area and thickness constraints of the hardware board. Each supply of the Hi3559A power subsystem is powered independently to keep the system stable; the ZU9EG power subsystem uses two IRPS5401 chips and one TDA21240 chip to power the ZU9EG, which greatly reduces the device count, shrinks the power subsystem and simplifies the power system.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments of the present invention will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic block diagram of video front-end intelligent monitoring hardware provided in an embodiment of the present invention.
Fig. 2 is a schematic block diagram of Hi3559A according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a Hi3559A and a peripheral power tree thereof according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of a ZU9EG and its peripheral power tree according to an embodiment of the present invention.
Fig. 5 is a block diagram of a video front-end intelligent monitoring hardware storage subsystem according to an embodiment of the present invention.
Fig. 6 is a block diagram of a clock subsystem provided by an embodiment of the invention.
Fig. 7 is a general calculation flowchart of the deep learning intelligent detection recognition algorithm according to the embodiment of the present invention.
Fig. 8 is a software flowchart of an algorithm of the video front-end intelligent monitoring system with software and hardware cooperation according to an embodiment of the present invention.
FIG. 9 is a flow chart of Hi3559A software provided in the embodiments of the present invention.
Fig. 10 is a data flow diagram for system performance testing according to an embodiment of the present invention.
Fig. 11 is a flowchart of a video front-end intelligent monitoring method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
In view of the problems in the prior art, the present invention provides a video front-end intelligent monitoring system and method, which are described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the hardware board card of the video front-end intelligent monitoring system provided in the embodiment of the present invention is composed of a Xilinx MPSoC, a HiSilicon SoC and their corresponding power supply circuits, clock and reset circuits, storage and configuration function circuits, and debugging/communication interface circuits;
the hardware board card of the video front-end intelligent monitoring system adopts the 3U VPX standard with dimensions of 10 cm × 16 cm × 8 cm, weighs 220 g, uses a 14-layer PCB design, has a typical operating power consumption of 7.6 W and a single-board peak computing capacity greater than 4 TOPS, and operates over a temperature range of 0 °C to 70 °C.
The video front-end intelligent monitoring system provided by the embodiment of the invention further comprises a main processor, which is used for realizing the system image data interface, the external communication interface, image preprocessing and system management;
the coprocessor is used for being responsible for the inference calculation of the convolutional neural network algorithm and transmitting the calculation result back to the main processor through the in-board communication interface;
the power subsystem is used for supplying power to the whole system;
the storage subsystem is used for storing the model parameters, the configuration files and the software codes in a nonvolatile memory and storing the intermediate layer characteristic diagram data generated in the forward reasoning calculation process of the convolutional neural network model in an on-chip cache and an off-chip dynamic memory;
the clock subsystem is used for providing each clock source required by each subsystem by adopting an active crystal oscillator;
and the data communication interface is used for realizing image data input/output and control instruction input/output of different video interfaces and realizing data interconnection of the main processor and the coprocessor through the high-speed data interface and the low-speed instruction interface.
As shown in fig. 11, the video front-end intelligent monitoring method provided by the embodiment of the present invention includes the following steps:
S101, a video image is input from the high-speed data interface SRIO or Ethernet of the ZU9EG;
S102, the ZU9EG applies an image enhancement method of image quantization/stretching and denoising, realizing image enhancement in parallel through pipelined operations;
S103, for large-format images, the ZU9EG can block the image, cutting it into sub-images before sending them to the HiSilicon Hi3559A for processing;
S104, the ZU9EG transmits the processed image to the HiSilicon Hi3559A through the high-speed data communication interface LVDS/PCIE/BT1120 of the Hi3559A;
S105, the Hi3559A receives the image data;
S106, the Hi3559A completes image preprocessing and forward inference of the deep neural network model using the convolutional neural network acceleration engine NNIE;
S107, the Hi3559A post-processes the inference result of the deep neural network model using its ARM CPU;
S108, the Hi3559A transmits the result to the ZU9EG using the ZU9EG low-speed data interface SPI/UART;
S109, the ZU9EG packages the result data and outputs the result using the external low-speed data interface SPI/UART, and outputs the processed image using the external high-speed data interface SRIO or Ethernet.
The technical solution of the present invention is further described below with reference to specific examples.
Aiming at the problems in the prior art, the invention provides a video front-end intelligent monitoring system in which heterogeneous video front-end intelligent monitoring hardware is designed on the basis of a Xilinx MPSoC ZU9EG FPGA (field programmable gate array) and a HiSilicon Hi3559A SoC, and a software-hardware cooperative algorithm software implementation scheme is provided on the basis of this hardware.
The hardware system has the following advantages:
(1) Through reasonable chip selection, power and storage design, and careful PCB layout and routing, the single board is small (10 cm × 16 cm × 8 cm), light (220 g) and low-power (7.6 W operating power consumption);
(2) A deep-learning AI processor is selected and a heterogeneous system architecture is adopted, so the system has strong AI computing capacity, with a peak computing capacity greater than 4 TOPS, and supports rapid deployment of complex deep-learning intelligent algorithms;
(3) The system provides various high-speed video data input/output interfaces such as 4x SRIO (26 Gbps transmission bandwidth), Gigabit Ethernet (1 Gbps) and USB (480 Mbps), together with the high-speed video data output interface HDMI, which can simultaneously satisfy the input and output of multi-channel ultra-high-definition video.
The software scheme adopts a software-hardware co-design: by decomposing the computation task and distributing it to appropriate hardware execution units, the hardware computing performance is fully exploited and heterogeneous acceleration is achieved.
The invention provides a video front-end intelligent monitoring system that designs heterogeneous video front-end intelligent monitoring hardware based on a Xilinx MPSoC ZU9EG FPGA and a HiSilicon Hi3559A SoC, and provides an algorithm software implementation scheme based on software-hardware cooperation.
1. Hardware solution
The schematic block diagram of the video front-end intelligent monitoring hardware is shown in fig. 1. The board mainly comprises a Xilinx MPSoC, a HiSilicon SoC and their corresponding power supply circuits, clock and reset circuits, functional circuits (storage and configuration) and debugging/communication interface circuits. The two processing systems efficiently integrate programmable logic resources, general-purpose processor resources and a dedicated convolutional neural network accelerator, so the video front-end intelligent monitoring hardware is flexible, high-performance and low-power.
(1) Main chip selection
The invention adopts a Xilinx MPSoC ZU9EG FPGA and a HiSilicon Hi3559A SoC to build the front-end intelligent monitoring hardware. The ZU9EG serves as the main processor and is responsible for the system data interfaces, the external communication interfaces, image preprocessing and system management functions. The HiSilicon Hi3559A serves as the coprocessor and is responsible for the inference computation of the convolutional neural network algorithm; the computation result is transmitted back to the ZU9EG through a communication interface.
The ZU9EG is a high-performance heterogeneous processor in the Xilinx MPSoC product family; the chip integrates an ARM CPU, a GPU and traditional FPGA programmable logic resources, and performs well in terms of system performance, flexibility and scalability. The ZU9EG is available in two package specifications: the FFVB1156 package (35 mm × 35 mm) provides 24 GTH transceivers (16.3 Gbps), while the FFVC900 package (31 mm × 31 mm) provides 16 GTH transceivers. Considering the limited board area, the FFVC900-packaged ZU9EG device is selected.
The Hi3559A is a high-performance, low-power audio/video processing chip produced by HiSilicon in a 12 nm process; its schematic block diagram is shown in fig. 2. The chip integrates a high-performance Image Signal Processor (ISP), a multi-core ARM CPU, a GPU and a Neural Network acceleration Engine (NNIE). The NNIE supports deep-learning algorithm deployment with a peak computing power of 4 TOPS (trillions of integer operations per second). With an advanced low-power process and a low-power architecture design, its typical power consumption is 3 W. The Hi3559 family offers two package sizes: the Hi3559A package (25 mm) with a 0.65 mm pin pitch and the Hi3559C package (15 mm) with a 0.4 mm pin pitch. Because the 0.4 mm pitch is small and routing is difficult, the Hi3559A device is selected.
(2) Power subsystem
The front-end intelligent monitoring hardware consists of the ZU9EG, the Hi3559A and their peripheral circuits; to reduce the coupling of the power system and improve system reliability, the invention designs the power subsystems of the two processing chips separately. The power subsystem design must consider both system efficiency and the area and thickness constraints of the hardware board. Each supply of the Hi3559A power subsystem is powered independently to keep the system stable; the ZU9EG power subsystem uses two IRPS5401 chips and one TDA21240 chip to power the ZU9EG, which greatly reduces the device count, shrinks the power subsystem and simplifies the power system.
The Hi3559A power supply subsystem is shown in fig. 3. The system improves power conversion efficiency through two-stage voltage conversion; the four 0.8 V core supplies (DVDD, DVDD_CPU, DVDD_GPU and DVDD_MEDIA) are powered independently, and dynamic voltage scaling is used during operation to reduce system power consumption. After the power management interface enables the MP8759 power chip, the 12 V single-board input is converted into a 5 V supply. A TLV75518 chip provides the 1.8 V supply for the Power Management Control unit (PMC) integrated in the Hi3559A chip. After the PMC starts, it sequentially enables the power management signals PWR_SE[0…].
As shown in fig. 4, the ZU9EG has rich on-chip resources, and powering each module independently would make the topology of the power system very complex. The invention therefore combines the supplies of some ZU9EG modules to simplify the power system design. Two IRPS5401 chips and one TDA21240 chip are used to power the ZU9EG power subsystem, which greatly reduces the device count and shrinks the power subsystem; the simplified power system helps improve system reliability and maintainability. According to the ZU9EG power consumption simulation analysis, the core supply (VCCINT) requires 20 A at peak operation, so an external independent DC-DC power chip, the TDA21240, is used to supply it. After the power management interface enables the IRPS5401 power chips, the 12 V single-board input is converted into the supplies required by the ZU9EG chip.
(3) Storage subsystem
In the front-end intelligent monitoring system designed by the invention, the inference computation task of the deep-learning convolutional neural network model places high demands on the storage capacity of the front-end intelligent monitoring hardware. The capacity and bandwidth of the memories are fully considered in the design of the storage subsystem, whose framework is shown in fig. 5: the model parameters, configuration files and software code are stored in non-volatile memory, while the intermediate-layer feature map data generated during the forward inference computation of the convolutional neural network model are mainly stored in the on-chip cache and the off-chip dynamic memory. For the selected ZU9EG and HiSilicon Hi3559A, a DDR4 controller is integrated in the Hi3559A chip. DDR4 adopts an 8n-bit prefetch architecture, and each Bank Group in the chip has independent activate, read, write and refresh operations, which effectively improves overall efficiency and bandwidth. In addition, the DDR4 operating voltage is reduced to 1.2 V, which effectively reduces memory access power consumption. To further increase the DDR4 bandwidth, four 16-bit-wide DDR4 devices are bit-spliced into a 64-bit-wide memory array.
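As an illustrative estimate (the exact figure depends on the speed grade actually configured, which is not stated here), the peak bandwidth of such a 64-bit array at a DDR4-2400 data rate is 64 bit × 2400 MT/s ÷ 8 bit/byte = 19.2 GB/s, and at DDR4-2666 it would be about 21.3 GB/s.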
The configuration of the external memories used by the processors in the hardware scheme is shown in tables 1 and 2. The Hi3559A has only one SPI Flash interface; the PCB adopts a special footprint design so that either SPI NOR Flash or SPI NAND Flash can be selectively soldered during subsequent debugging.
TABLE 1. HiSilicon Hi3559A external memory parameters
TABLE 2. Xilinx XCZU9EG external memory parameters
(4) Clock subsystem
To ensure reliable operation of the system, every clock source required by the hardware system uses an active crystal oscillator. As shown in fig. 6, the ZU9EG clock subsystem includes four clock sources: PS_REF_CLK (system reference clock), PCIE_CLK (PCIe reference clock), SRIO_CLK (SRIO reference clock) and PS_RTC_CLK (system real-time clock). The frequency of PS_REF_CLK is 33.333 MHz, the frequency of PCIE_CLK is 100 MHz, the frequency of SRIO_CLK is 156.25 MHz and the frequency of PS_RTC_CLK is 32.768 kHz. The Hi3559A clock subsystem includes two clock sources: XIN (system reference clock) and RTC_XIN (system real-time clock), where the XIN frequency is 24 MHz and the RTC_XIN frequency is 32.768 kHz. In addition to the clocks related to the processing chips, the Ethernet physical-layer transceiver (EPHY) integrated in the hardware system requires a 25 MHz reference clock.
(5) Data communication interface
As shown in fig. 1, the hardware of the video front-end intelligent monitoring system has rich external interfaces that realize the input/output of different video interfaces and of control instructions; on the board, the ZU9EG and the Hi3559A are interconnected through a high-speed data interface and a low-speed instruction interface to complete data interaction between the ZU9EG and the HiSilicon Hi3559A. The specific interfaces are shown in table 3.
TABLE 3. Hardware data communication interfaces of the video front-end intelligent monitoring system
In the hardware of the video front-end intelligent monitoring system, the board adopts the 3U VPX standard (10 cm × 16 cm × 8 cm), weighs 220 g, uses a 14-layer PCB design, has a typical operating power consumption of 7.6 W and a single-board peak computing capacity greater than 4 TOPS, and operates over a temperature range of 0 °C to 70 °C. This hardware system overcomes the insufficient computing and storage resources and the single data communication interface of traditional video front-end intelligent monitoring systems, and provides a good hardware basis for video front-end intelligent monitoring.
2. Software solution
The hardware of the video front-end intelligent monitoring system has strong computing power: the neural network inference engine integrated in the HiSilicon Hi3559A chip provides a dedicated hardware computing unit for each operator in the forward inference computation of a convolutional neural network model, so the system can support complex deep-learning intelligent detection and recognition algorithms, for example face recognition, pedestrian/vehicle detection, defect detection and smoke detection, thereby making the video front-end monitoring system intelligent.
As shown in fig. 7, a deep-learning intelligent detection and recognition algorithm generally comprises five parts: image input, preprocessing, forward inference of the convolutional neural network model, post-processing and result output. Based on the above hardware system, the invention provides a software-hardware co-designed algorithm software flow that realizes heterogeneous acceleration by decomposing the computation task and distributing it to appropriate hardware execution units.
As shown in fig. 8, the software flow of the video front-end intelligent monitoring system algorithm is as follows:
(1) The video image is input from a high-speed data interface (SRIO or Ethernet) of the ZU9EG;
(2) The ZU9EG can apply image enhancement methods such as image quantization/stretching and denoising, realizing image enhancement in parallel through pipelined operations (a reference model of the stretch transform follows this list);
(3) For large-format images (e.g., 4096 × 4096), the ZU9EG can block the image, cutting it into sub-images before sending them to the HiSilicon Hi3559A;
(4) The ZU9EG transmits the processed image to the Hi3559A through the high-speed data communication interface (LVDS/PCIE/BT1120) of the Hi3559A;
(5) The Hi3559A receives the image data;
(6) The Hi3559A completes image preprocessing and forward inference of the deep neural network model using the convolutional neural network acceleration engine (NNIE);
(7) The Hi3559A post-processes the inference result of the deep neural network model using its ARM CPU;
(8) The Hi3559A transmits the result to the ZU9EG using the ZU9EG low-speed data interface (SPI/UART);
(9) The ZU9EG packages the result data and outputs the result using an external low-speed data interface (SPI/UART), and outputs the processed image using an external high-speed data interface (SRIO or Ethernet).
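As a reference model of the quantization/stretch enhancement in item (2), the following C routine linearly maps an assumed 16-bit input window [lo, hi] onto 8-bit output with clipping. On the board this step would run as a pipelined operation in the ZU9EG programmable logic; the input bit width, the clipping behaviour and the function interface here are illustrative assumptions only.

```c
#include <stddef.h>
#include <stdint.h>

/* Linear stretch: map pixel values in [lo, hi] to [0, 255], clipping values
 * outside the window. 'n' is the number of pixels in the frame. */
void linear_stretch_u16_to_u8(const uint16_t *in, uint8_t *out, size_t n,
                              uint16_t lo, uint16_t hi)
{
    uint32_t range = (hi > lo) ? (uint32_t)(hi - lo) : 1u;  /* avoid /0 */

    for (size_t i = 0; i < n; ++i) {
        uint16_t v = in[i];
        if (v <= lo)
            out[i] = 0;
        else if (v >= hi)
            out[i] = 255;
        else
            out[i] = (uint8_t)(((uint32_t)(v - lo) * 255u) / range);
    }
}
```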
A quad-core CPU and a dual-core NNIE are integrated in the Hi3559A, and the deep-learning intelligent algorithm computation on the Hi3559A involves scheduling and data transmission across different computing resources such as the NNIE and the CPU. The invention designs a multi-thread parallel Hi3559A application-layer software flow, mainly comprising a main thread, an image receiving sub-thread, NNIE compute sub-threads and a data communication sub-thread; the multi-thread parallelism increases the concurrency of the computation so that the computing performance of the HiSilicon chip is fully exploited (a sketch of the main thread follows the flow below). As shown in fig. 9, the Hi3559A software flow is as follows:
a main thread:
(1) Initializing a system;
(2) Initializing a neural network acceleration engine, and loading a model;
(3) Creating an image receiving sub-thread;
(4) Creating a communication sub-thread;
(5) Creating an NNIE0/1 processing thread;
(6) Waiting for the sub-threads to exit;
(7) The system is de-initialized;
image data stream thread:
(1) Initializing a video input module;
(2) Starting a video input module;
(3) Initializing a video processing module;
(4) Starting a video processing module;
(5) Binding the input data stream;
(6) Acquiring video frame data;
(7) Writing into an image cache;
(8) Repeating the step (6);
NNIE0/1 compute thread:
(1) Reading an image from the image cache;
(2) Forward reasoning calculation;
(3) Post-processing;
(4) Writing the result into a cache;
(5) Repeating the step (1);
data communication sub-thread:
(1) Initializing a communication interface;
(2) Reading a result from the result cache;
(3) Data transmission;
(4) And (4) repeating the step (1).
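The main-thread portion of the Hi3559A software flow above can be sketched with POSIX threads as follows. The thread entry functions, the init/de-init helpers and the model file name are placeholders standing in for the corresponding HiSilicon SDK calls rather than actual SDK APIs; only the thread structure (create the image receiving, NNIE0/1 and communication sub-threads, wait for them to exit, then de-initialize) reflects the flow of fig. 9.

```c
#include <pthread.h>
#include <stdio.h>

/* Placeholder sub-thread bodies; in the real system these would run the
 * image data stream, NNIE0/1 compute and data communication loops. */
static void *image_recv_thread(void *arg)   { (void)arg; return NULL; }
static void *nnie_compute_thread(void *arg) { (void)arg; return NULL; }
static void *comm_thread(void *arg)         { (void)arg; return NULL; }

/* Placeholder init/de-init helpers standing in for SDK calls. */
static int  system_init(void)                       { return 0; }
static int  nnie_init_and_load_model(const char *p) { (void)p; return 0; }
static void system_deinit(void)                     { }

int main(void)
{
    pthread_t recv_tid, comm_tid, nnie_tid[2];

    /* Steps (1)-(2): system init and NNIE model loading. */
    if (system_init() != 0 || nnie_init_and_load_model("model.wk") != 0) {
        fprintf(stderr, "initialization failed\n");
        return 1;
    }

    /* Steps (3)-(5): create the image receiving, communication and NNIE0/1 threads. */
    pthread_create(&recv_tid, NULL, image_recv_thread, NULL);
    pthread_create(&comm_tid, NULL, comm_thread, NULL);
    pthread_create(&nnie_tid[0], NULL, nnie_compute_thread, (void *)0);
    pthread_create(&nnie_tid[1], NULL, nnie_compute_thread, (void *)1);

    /* Steps (6)-(7): wait for the sub-threads to exit, then de-initialize. */
    pthread_join(recv_tid, NULL);
    pthread_join(nnie_tid[0], NULL);
    pthread_join(nnie_tid[1], NULL);
    pthread_join(comm_tid, NULL);
    system_deinit();
    return 0;
}
```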
The technical solution of the present invention is further described below with reference to simulation experiments.
Based on the video front-end intelligent monitoring system, a deep-learning ship target detection algorithm is deployed to monitor ships on the sea surface. The ground test system is shown in fig. 10 and comprises the video front-end intelligent monitoring system hardware (MPU), a test backplane, an FMC cable, an information injection board and a serial-port level converter. The information injection board is a Xilinx ZC706 or ZCU102 development board, which integrates an Ethernet interface, a serial port and a 4x SRIO interface; its network port is connected to a PC, and the host computer on the PC sends image data to the information injection board through the network port. The information injection board parses the image data, converts its format and sends it to the MPU through the 4x SRIO interface. The MPU parses and processes the image data, performs forward inference of the deep-learning model and outputs the detection and recognition result, including the target class, the frame number of the picture containing the target, and the coordinates and confidence of the target in the picture; this information is transmitted through the serial-port level converter to a receiving computer, which displays and stores the processing result in real time and shows the image processing result on the host computer.
As shown in table 4, for a 1024 × 1280 resolution image the processing speed is 12.3 fps with a single NNIE core and 20.5 fps with both NNIE cores. The information processing board outputs the detection result to the host computer through the serial port for the algorithm accuracy test: with an IOU threshold of 0.5, the detection precision is 0.851, the detection recall is 0.886 and the F1 score is 0.868. To accelerate the inference speed of the model on the HiSilicon chip, the model conversion quantizes floating-point numbers to integers, so the accuracy of the model is slightly reduced.
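For reference, the reported F1 score is consistent with the precision and recall figures above: F1 = 2PR/(P+R) = 2 × 0.851 × 0.886 / (0.851 + 0.886) ≈ 0.868.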
TABLE 4. Accuracy and speed test of the ship target detection and recognition algorithm on the video front-end intelligent monitoring system
In the description of the present invention, "a plurality" means two or more unless otherwise specified; the terms "upper", "lower", "left", "right", "inner", "outer", "front", "rear", "head", "tail", and the like, indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, are only for convenience in describing and simplifying the description, and do not indicate or imply that the device or element referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, should not be construed as limiting the invention. Furthermore, the terms "first," "second," "third," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware or any combination thereof. When software is used, the implementation may take the form, in whole or in part, of a computer program product that includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the flows or functions according to the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD) or a semiconductor medium (e.g., a Solid State Disk (SSD)).
The above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection of the present invention; any modifications, equivalent replacements and improvements made within the spirit and principle of the present invention shall be covered by the scope of protection of the appended claims.

Claims (9)

1. A video front-end intelligent monitoring method, characterized by comprising the following steps:
firstly, inputting a video image from a high-speed data interface SRIO or Ethernet of a main processor ZU9EG;
step two, the main processor ZU9EG selects an image enhancement method of image quantization, stretching and denoising, and image enhancement is achieved in parallel through pipeline operation;
step three, for large-format images, the main processor ZU9EG can perform image blocking on the images, and the images are cut into sub-images and then sent to a coprocessor Hi3559A for processing;
step four, the main processor ZU9EG transmits the processed image to the coprocessor Hi3559A through a high-speed data communication interface LVDS/PCIE/BT1120 of the coprocessor;
step five, the coprocessor Hi3559A receives image data;
step six, the coprocessor Hi3559A completes image preprocessing and forward reasoning of a deep neural network model by using a convolutional neural network acceleration engine NNIE;
seventhly, the coprocessor Hi3559A utilizes an ARM CPU to carry out post-processing on the inference result of the deep neural network model;
step eight, the coprocessor Hi3559A transmits the result to the main processor ZU9EG by using a low-speed data interface SPI/UART of the main processor;
and step nine, the main processor ZU9EG packages the result data and outputs the result by using an external low-speed data interface SPI/UART, and the processed image is output by using an external high-speed data interface SRIO or Ethernet.
2. The video front-end intelligent monitoring method according to claim 1, wherein a quad-core CPU and a dual-core NNIE are integrated in the coprocessor, and the deep-learning intelligent algorithm computation on the coprocessor involves scheduling and data transmission across different computing resources including the NNIE and the CPU; a multi-thread parallel application-layer software flow is designed for the coprocessor, comprising a main thread, an image receiving sub-thread, NNIE compute sub-threads and a data communication sub-thread;
wherein, the software flow of the coprocessor is as follows:
(1) The main thread:
(1.1) initializing a system;
(1.2) initializing a neural network acceleration engine and loading a model;
(1.3) creating an image receiving sub-thread;
(1.4) creating a communication sub-thread;
(1.5) creating an NNIE0/1 processing thread;
(1.6) waiting for the sub thread to quit;
(1.7) deinitializing the system;
(2) Image data stream thread:
(2.1) initializing a video input module;
(2.2) starting a video input module;
(2.3) initializing a video processing module;
(2.4) starting a video processing module;
(2.5) binding the input data stream;
(2.6) acquiring video frame data;
(2.7) writing to an image cache;
(2.8) repeating the step (2.6);
(3) NNIE0/1 compute sub-thread:
(3.1) reading an image from the image cache;
(3.2) forward inference calculation;
(3.3) post-processing;
(3.4) writing the result to the result cache;
(3.5) repeating the step (3.1);
(4) Data communication sub-thread:
(4.1) initializing a communication interface;
(4.2) reading a result from the result cache;
(4.3) data transmission;
(4.4) repeating the step (4.1).
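The thread structure of claim 2 can be sketched with POSIX threads roughly as follows. system_init, video_get_frame, nnie_forward, the cache helpers and comm_* are hypothetical stand-ins for HiSilicon MPP/NNIE SDK calls, and the model path is illustrative; the sketch only shows how the main thread, image receiving sub-thread, NNIE compute sub-threads and data communication sub-thread fit together.

```c
/* Skeleton of the multi-thread coprocessor flow of claim 2 (assumed structure). */
#include <pthread.h>
#include <stdbool.h>

extern void system_init(void);
extern void system_deinit(void);
extern void nnie_init_and_load_model(const char *path);
extern bool video_get_frame(void *frame);      /* VI/VPSS capture, steps (2.1)-(2.6) */
extern void image_cache_put(void *frame);
extern bool image_cache_get(void *frame);
extern void nnie_forward(void *frame, void *result);
extern void postprocess(void *result);
extern void result_cache_put(void *result);
extern bool result_cache_get(void *result);
extern void comm_init(void);
extern void comm_send(void *result);

static volatile bool running = true;

static void *image_thread(void *arg)           /* (2) image receiving sub-thread */
{
    (void)arg;
    char frame[1];                             /* placeholder frame handle */
    while (running)
        if (video_get_frame(frame))            /* (2.6) */
            image_cache_put(frame);            /* (2.7); loop back = (2.8) */
    return NULL;
}

static void *nnie_thread(void *arg)            /* (3) NNIE0/1 compute sub-thread */
{
    (void)arg;
    char frame[1], result[1];
    while (running) {
        if (!image_cache_get(frame)) continue; /* (3.1) read image cache   */
        nnie_forward(frame, result);           /* (3.2) forward inference  */
        postprocess(result);                   /* (3.3) on the ARM CPU     */
        result_cache_put(result);              /* (3.4); loop back = (3.5) */
    }
    return NULL;
}

static void *comm_thread(void *arg)            /* (4) data communication sub-thread */
{
    (void)arg;
    comm_init();                               /* (4.1) */
    char result[1];
    while (running)
        if (result_cache_get(result))          /* (4.2) read result cache */
            comm_send(result);                 /* (4.3); loop             */
    return NULL;
}

int main(void)                                 /* (1) main thread */
{
    pthread_t img, comm, nnie0, nnie1;
    system_init();                                      /* (1.1) */
    nnie_init_and_load_model("model.wk");               /* (1.2); path is illustrative */
    pthread_create(&img,   NULL, image_thread, NULL);   /* (1.3) */
    pthread_create(&comm,  NULL, comm_thread,  NULL);   /* (1.4) */
    pthread_create(&nnie0, NULL, nnie_thread,  NULL);   /* (1.5) NNIE core 0 */
    pthread_create(&nnie1, NULL, nnie_thread,  NULL);   /* (1.5) NNIE core 1 */
    pthread_join(img, NULL);   pthread_join(comm, NULL);
    pthread_join(nnie0, NULL); pthread_join(nnie1, NULL); /* (1.6) */
    system_deinit();                                    /* (1.7) */
    return 0;
}
```

The two caches decouple capture, inference and communication, which is what allows the two NNIE cores to be fed in parallel by a single image stream.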
3. A video front-end intelligent monitoring system based on the video front-end intelligent monitoring method according to any one of claims 1 to 2, wherein the video front-end intelligent monitoring system comprises:
the main processor is used for realizing a system image data interface, an external communication interface, image preprocessing and system management;
the coprocessor is used for performing the inference calculation of the convolutional neural network algorithm and transmitting the calculation result back to the main processor through the in-board communication interface;
the power supply subsystem is used for supplying power to the whole system, wherein the main processor, the coprocessor and the other peripherals are powered independently, so that the coupling of the power supply system is reduced and the reliability of the system is improved;
the storage subsystem is used for storing the model parameters, configuration files and software code in nonvolatile memory, and for storing the intermediate-layer feature map data generated during the forward inference calculation of the convolutional neural network model in on-chip cache and off-chip dynamic memory;
the clock subsystem is used for providing each clock source required by each subsystem by adopting an active crystal oscillator;
and the data communication interface is used for realizing image data input/output and control instruction input/output of different video interfaces and realizing data interconnection of the main processor and the coprocessor through a high-speed data interface and a low-speed instruction interface.
4. The video front-end intelligent monitoring system of claim 3, wherein the main processor is a ZU9EG, which integrates an ARM CPU, a GPU and FPGA programmable logic resources;
the coprocessor is a Hi3559A audio and video processing chip, which integrates a high-performance image signal processor (ISP), a multi-core ARM CPU, a GPU and a neural network acceleration engine (NNIE).
5. The video front-end intelligent monitoring system of claim 3, wherein the power subsystem comprises:
the ZU9EG power supply subsystem consists of two IRPS5401 chips and one TDA21240 chip; after the IRPS5401 power supply chips are enabled through the power management interface, the 12V input supply is converted into the supplies required by the ZU9EG chip, such as 1.2V, 1.8V and 0.9V; an external independent DC-DC power supply chip, the TDA21240, generates the 0.85V core voltage at high current to meet the ZU9EG core-voltage peak current requirement of 20A.
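As a rough sanity check of the core-rail sizing in claim 5, the short program below computes the peak core power and the corresponding 12V input current, assuming an illustrative converter efficiency of 90% (the efficiency figure is not from the patent).

```c
/* Back-of-the-envelope check of the core rail in claim 5: 0.85 V at a 20 A
 * peak is 17 W on that rail; at an assumed 90% converter efficiency that
 * draws roughly 1.6 A from the 12 V input. */
#include <stdio.h>

int main(void)
{
    const double vcore = 0.85, icore_peak = 20.0;  /* figures from claim 5      */
    const double vin = 12.0, eff = 0.90;           /* assumed input/efficiency  */
    double p_core = vcore * icore_peak;            /* 17 W peak core power      */
    double i_in   = p_core / (vin * eff);          /* about 1.57 A at 12 V      */
    printf("core power %.1f W, 12 V input current %.2f A\n", p_core, i_in);
    return 0;
}
```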
6. The video front-end intelligent monitoring system according to claim 3, wherein the storage subsystem is composed of nonvolatile memory, on-chip cache and off-chip dynamic memory; in addition to their on-chip caches, the main processor ZU9EG and the coprocessor Hi3559A are each externally connected to DDR4 dynamic memory, eMMC and Flash nonvolatile memory; the DDR4 dynamic memory of the main processor ZU9EG uses four MT40A512M16JY-075EAIT chips, the eMMC uses an H26M41208HPR chip, the SPI Nor Flash uses an MX25U25635F chip and the SPI Nand Flash uses an MX35UF2G14AC-Z4I chip; the DDR4 dynamic memory of the coprocessor Hi3559A uses four MT40A512M16JY-083EAIT chips, the eMMC uses an MTFC8GACAJCN-4M_IT_TR chip, and the SPI Nor Flash uses an MT25QU512ABB8ESF-0SIT chip.
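For orientation, the sketch below works out the DDR4 array size implied by claim 6, under the assumption that the MT40A512M16 parts are organised as 512M x 16 (8Gbit) devices; four x16 devices then form a 64-bit wide, 4GB array per processor.

```c
/* Capacity arithmetic for the DDR4 arrays of claim 6 (assumed device organisation). */
#include <stdio.h>

int main(void)
{
    const unsigned depth_m   = 512;  /* 512 M addresses per device (assumed) */
    const unsigned width_bit = 16;   /* x16 organisation                     */
    const unsigned devices   = 4;    /* per processor, per claim 6           */
    unsigned gbit_per_dev = depth_m * width_bit / 1024;  /* 8 Gbit per device */
    unsigned bus_width    = devices * width_bit;         /* 64-bit data bus   */
    unsigned total_gbyte  = devices * gbit_per_dev / 8;  /* 4 GB in total     */
    printf("%u Gbit/device, %u-bit bus, %u GB total\n",
           gbit_per_dev, bus_width, total_gbyte);
    return 0;
}
```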
7. The video front-end intelligent monitoring system according to claim 3, wherein the ZU9EG clock subsystem comprises four clock sources: a system reference clock PS_REF_CLK, a PCIe reference clock PCIE_CLK, an SRIO reference clock SRIO_CLK and a system real-time clock PS_RTC_CLK; wherein the PS_REF_CLK frequency is 33.333MHz, the PCIE_CLK frequency is 100MHz, the SRIO_CLK frequency is 156.25MHz, and the PS_RTC_CLK frequency is 32.768kHz; the Hi3559A clock subsystem comprises two clock sources: a system reference clock XIN and a system real-time clock RTC_XIN; wherein the XIN frequency is 24MHz and the RTC_XIN frequency is 32.768kHz; in addition to the processor clocks, the Ethernet physical-layer transceiver (EPHY) integrated in the hardware system requires a 25MHz reference clock.
8. The video front-end intelligent monitoring system of claim 3, characterized in that the data communication interface comprises:
ZU9EG_SRIO, a high-speed data interface with a peak rate of 20Gbit/s, for system high-definition video input/output;
Hi3559A_HDMI, a high-definition video interface with a peak rate of 5Gbit/s, for system high-definition video output;
ZU9EG_PCIE_Hi3559A, a high-speed data interface with a peak rate of 8Gbit/s, for in-board high-speed data transmission between ZU9EG and Hi3559A;
ZU9EG_LVDS_Hi3559A, a high-speed data interface with a peak rate of 1.9Gbit/s, for in-board high-speed data transmission between ZU9EG and Hi3559A;
ZU9EG_BT1120_Hi3559A, a high-speed data interface with a peak rate of 148.5Mbit/s, for in-board high-speed data transmission between ZU9EG and Hi3559A;
ZU9EG_Ethernet, a high-speed data interface with a peak rate of 1Gbit/s, for system high-speed data transmission or ZU9EG debugging;
Hi3559A_Ethernet, a high-speed data interface with a peak rate of 1Gbit/s, for Hi3559A debugging;
USB/UART/SPI/IO, low-speed data interfaces for control instruction output.
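As a rough illustration of how these link rates relate to video traffic, the sketch below compares an assumed 1080p60 stream at 16 bits per pixel (illustrative figures, not stated in the claims) against the SRIO, PCIe and LVDS peak rates listed above.

```c
/* Rough comparison of an assumed 1080p60, 16-bit-per-pixel stream against
 * the in-board link peak rates listed in claim 8. */
#include <stdio.h>

int main(void)
{
    const double w = 1920, h = 1080, bpp = 16, fps = 60;   /* assumed stream    */
    double need_gbps = w * h * bpp * fps / 1e9;            /* about 1.99 Gbit/s */
    const double rate[] = { 20.0, 8.0, 1.9 };              /* SRIO, PCIe, LVDS  */
    const char  *name[] = { "SRIO", "PCIe", "LVDS" };
    printf("stream rate %.2f Gbit/s\n", need_gbps);
    for (int i = 0; i < 3; i++)
        printf("%-4s %5.1f Gbit/s -> %s\n", name[i], rate[i],
               rate[i] >= need_gbps ? "fits" : "needs tiling or compression");
    return 0;
}
```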
9. A computer device, characterized in that the computer device comprises a memory and a processor, the memory stores a computer program, and the computer program, when executed by the processor, causes the processor to execute the video front-end intelligent monitoring method according to any one of claims 1 to 2.
CN202111109933.0A 2021-09-18 2021-09-18 Video front-end intelligent monitoring system and method, computer equipment and terminal Active CN114007037B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111109933.0A CN114007037B (en) 2021-09-18 2021-09-18 Video front-end intelligent monitoring system and method, computer equipment and terminal

Publications (2)

Publication Number Publication Date
CN114007037A CN114007037A (en) 2022-02-01
CN114007037B true CN114007037B (en) 2023-03-07

Family

ID=79922027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111109933.0A Active CN114007037B (en) 2021-09-18 2021-09-18 Video front-end intelligent monitoring system and method, computer equipment and terminal

Country Status (1)

Country Link
CN (1) CN114007037B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116934728A (en) * 2023-07-31 2023-10-24 江苏济远医疗科技有限公司 Hysteroscope image target detection acceleration method based on embedded AI processor
CN117710271B (en) * 2024-02-06 2024-04-19 成都戎盛科技有限公司 Transparency processing method and system based on Hai Si 2D acceleration platform
CN117893391A (en) * 2024-03-14 2024-04-16 华中科技大学 Multispectral image intelligent processing system and multispectral image intelligent processing method for unmanned aerial vehicle platform

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8331737B2 (en) * 2007-04-23 2012-12-11 International Business Machines Corporation Heterogeneous image processing system
US8442927B2 (en) * 2009-07-30 2013-05-14 Nec Laboratories America, Inc. Dynamically configurable, multi-ported co-processor for convolutional neural networks
US10099614B2 (en) * 2011-11-28 2018-10-16 Magna Electronics Inc. Vision system for vehicle
WO2018184192A1 (en) * 2017-04-07 2018-10-11 Intel Corporation Methods and systems using camera devices for deep channel and convolutional neural network images and formats
CN113168541A (en) * 2018-10-15 2021-07-23 菲力尔商业系统公司 Deep learning inference system and method for imaging system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008128901A1 (en) * 2007-04-23 2008-10-30 International Business Machines Corporation Heterogeneous image processing system
CN109993303A (en) * 2019-03-29 2019-07-09 河南九乾电子科技有限公司 Computer accelerator for neural network and deep learning
CN110321378A (en) * 2019-06-03 2019-10-11 梁勇 A kind of mobile monitor image identification system and method
CN110569737A (en) * 2019-08-15 2019-12-13 深圳华北工控软件技术有限公司 Face recognition deep learning method and face recognition acceleration camera
CN110717433A (en) * 2019-09-30 2020-01-21 华中科技大学 Deep learning-based traffic violation analysis method and device
CN111083443A (en) * 2019-12-25 2020-04-28 中山大学 Monitoring center auxiliary system and method based on deep learning
CN111709522A (en) * 2020-05-21 2020-09-25 哈尔滨工业大学 Deep learning target detection system based on server-embedded cooperation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Key Technologies of an Intelligent Video High-Speed Data Processing System Based on UltraScale FPGA; Lin Yufeng; China Master's Theses Full-text Database, Information Science and Technology; 2019-05-15 (No. 5); full text *

Also Published As

Publication number Publication date
CN114007037A (en) 2022-02-01

Similar Documents

Publication Publication Date Title
CN114007037B (en) Video front-end intelligent monitoring system and method, computer equipment and terminal
Feng et al. Computer vision algorithms and hardware implementations: A survey
US11703939B2 (en) Signal processing device and related products
US11562115B2 (en) Configurable accelerator framework including a stream switch having a plurality of unidirectional stream links
EP3346423B1 (en) Deep convolutional network heterogeneous architecture system and device
EP3346426B1 (en) Reconfigurable interconnect, corresponding system and method
EP3660628A1 (en) Dynamic voltage frequency scaling device and method
EP3346425B1 (en) Hardware accelerator engine and method
US11740941B2 (en) Method of accelerating execution of machine learning based application tasks in a computing device
CN111159093B (en) Heterogeneous intelligent computing system
CN110751676A (en) Heterogeneous computing system and method based on target detection and readable storage medium
US11967150B2 (en) Parallel video processing systems
JP7268063B2 (en) System and method for low-power real-time object detection
CN104850516B (en) A kind of DDR Frequency Conversion Designs method and apparatus
Chang et al. A memory-optimized and energy-efficient CNN acceleration architecture based on FPGA
CN111860773A (en) Processing apparatus and method for information processing
US20190354159A1 (en) Convolutional operation device and method
Wang et al. A retrospective evaluation of energy-efficient object detection solutions on embedded devices
EP4311202A1 (en) End-edge-cloud coordination system and method based on digital retina, and device
Ying et al. Exploiting Frame Similarity for Efficient Inference on Edge Devices
CN113704156B (en) Sensing data processing device, board card, system and method
CN114217688B (en) NPU power consumption optimization system and method based on neural network structure
CN113469326B (en) Integrated circuit device and board for executing pruning optimization in neural network model
Bai et al. An OpenCL-based FPGA accelerator with the Winograd’s minimal filtering algorithm for convolution neuron networks
CN109993286A (en) The calculation method and Related product of sparse neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant