CN112162554A

CN112162554A - Data storage and backtracking platform for N3 sweeper

Info

Publication number: CN112162554A
Application number: CN202011006589.8A
Authority: CN
Inventors: 于远彬; 罗春麒; 蒋劲宇
Original assignee: Jilin University
Current assignee: Jilin University
Priority date: 2020-09-23
Filing date: 2020-09-23
Publication date: 2021-01-01
Anticipated expiration: 2040-09-23
Also published as: CN112162554B

Abstract

The invention discloses a data storage and backtracking platform for an N3-class electric sweeper, consisting of an online end and an offline end. The online end acquires and stores road garbage information online, records the sweeper's current upper gear operation, supplies data to the upper gear selection module, and provides the optimal upper gear for the current garbage scene to the online vehicle speed optimization control module, thereby decoupling upper-mounted operation from vehicle speed control. The offline end trains the upper gear selection algorithm on the data collected at the online end, obtains the relevant parameters of the algorithm, and sends them to the upper gear selection module at the online end to iteratively update the upper gear control algorithm there; it also fuses the neural network parameters of the online vehicle speed optimization control modules of multiple sweepers and transmits the fused parameters to each sweeper. The online end and the offline end exchange data through a control information updating system.

Description

Data storage and backtracking platform for N3 sweeper

Technical Field

The invention relates to the field of intelligent control of sweeper trucks, in particular to a data storage and backtracking platform for N3 sweeper trucks.

Background

At present, intelligent road operation of an N3-class pure electric sweeper requires: (1) carrying out a power matching test in advance, in which a scene is arranged manually and the optimal gear is then selected, to obtain power matching test data; (2) training a power matching model on the power matching test data to imitate the driver's operating behavior; (3) during actual cleaning, feeding the types of road garbage in the current scene and the coverage rate of each type, as output by the vision sensor and the perception algorithm, into the model trained on the power matching test data, so that the upper-mounted system gear required by the current scene is output automatically.

This technical route for intelligent sweeper operation has the following problems: (1) the number and types of manually arranged scenes cannot reflect complex and diverse real road conditions, so the power matching test data do not represent real roads, which degrades the precision and accuracy of the trained power matching model; (2) each power matching test requires arranging a scene and then selecting an optimal gear, so obtaining power matching data consumes a large amount of labor, material and time, and the power matching model cannot be updated continuously.

To solve these problems, operation scene data and the corresponding vehicle running data are collected continuously during daily operation to replace the power matching test data. A machine learning algorithm analyzes the driver's behavior from these data and forms an iteratively updatable intelligent control algorithm that replaces the original power matching model, so that the intelligent operation algorithm can be updated continuously.

Disclosure of Invention

The invention aims to provide a data storage and backtracking platform for an N3-class electric sweeper. The platform describes what the driver sees by converting video data, stores that information together with the driver's operations through data synchronization, analyzes the driver's upper-mounted operation behavior through an upper gear selection algorithm, and learns the speed optimization control rule of the sweeping operation mode through an online vehicle speed optimization control algorithm, thereby realizing fully automatic intelligent control of the whole sweeper.

The purpose of the invention is realized by the following technical scheme:

a data storage and backtracking platform for an N3 type electric sweeper comprises an on-line end and an off-line end;

the online end comprises a data online acquisition and storage system, an upper gear selection module and an online vehicle speed optimization control module; the data online acquisition and storage system acquires and stores road garbage information online, records the sweeper's current upper gear operation, and provides data for the upper gear selection module; the upper gear selection module provides the optimal upper gear for the current garbage scene to the online vehicle speed optimization control module as one of its inputs, thereby decoupling upper-mounted operation from vehicle speed control;

the offline end comprises an upper gear selection training module based on a deep belief network and a vehicle speed optimization control neural network parameter fusion module; the upper gear selection training module based on the deep belief network trains the upper gear selection algorithm on the data collected at the online end, obtains the relevant parameters of the algorithm, and sends them to the upper gear selection module at the online end to iteratively update the upper gear control algorithm there; the vehicle speed optimization control neural network parameter fusion module fuses the neural network parameters of the online vehicle speed optimization control modules of multiple sweepers and transmits the fused parameters to each sweeper, so that the sweepers share their speed control experience;

and the on-line end and the off-line end realize data transmission through a control information updating system.

Furthermore, the data online acquisition and storage system comprises a data acquisition module and a data storage module. The data acquisition module acquires information from the front camera, the rear camera and the CAN bus of the sweeper, cleans the acquired information, eliminates unreasonable data, synchronizes the data, and constructs the input data required by the upper gear selection module and the online vehicle speed optimization control module. The data storage module provides temporary storage for the data constructed by the data acquisition module and for the operation video of the current day, screens the data to find operation scenes whose cleaning effect is unsatisfactory, corrects the gear information of those scenes, and stores the data in a designated area so that the offline end can relearn the scene information.

Further, the data acquisition module comprises the following working steps:

step one, data discretization: carrying out discretization sampling on video information acquired by the front camera and the rear camera to form a picture sequence with the same time interval and a time label;

step two, data cleaning: checking whether every attribute of the discretized picture sequence and of the data transmitted over the CAN bus is complete and reasonable, discarding pictures taken while the camera was blocked, and filling in missing data attributes;

step three, garbage feature extraction: obtaining the type and position distribution of the garbage using a road garbage recognition algorithm based on fast-RCNN and image region division, and obtaining the coverage rate of each kind of garbage before and after cleaning using a garbage coverage rate recognition algorithm and camera calibration data established by test for each vehicle type; finally, flattening the garbage type, position distribution and pre-cleaning coverage rate into a single row vector, called the garbage vector;

step four, data synchronization: and selecting a certain time point as a reference, matching the various types of information extracted in the step three according to time, eliminating data of redundant time nodes and invalid training data of a clean road surface, and realizing synchronization of various types of data and construction of a scene array.
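The synchronization in step four can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: the record layouts, the function name `synchronize`, the matching tolerance, and the convention that a clean-road scene has an all-zero garbage vector are assumptions made for the example.

```python
from bisect import bisect_left


def synchronize(garbage_records, can_records, tolerance=0.5):
    """Pair each garbage vector with the CAN sample closest in time.

    garbage_records: list of (timestamp_s, garbage_vector), sorted by time
    can_records:     list of (timestamp_s, speed_kmh, gear), sorted by time
    tolerance:       hypothetical maximum time gap in seconds for a match
    """
    can_times = [t for t, _, _ in can_records]
    scenes = []
    for t, vector in garbage_records:
        # Clean-road scenes (assumed all-zero vectors) carry no training value.
        if not any(vector):
            continue
        if not can_times:
            continue
        i = bisect_left(can_times, t)
        # Candidates: the CAN samples just before and just after time t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(can_times)]
        best = min(candidates, key=lambda j: abs(can_times[j] - t))
        # Time nodes without a close enough partner are dropped as redundant.
        if abs(can_times[best] - t) <= tolerance:
            _, speed, gear = can_records[best]
            scenes.append((t, vector, speed, gear))
    return scenes
```

Each returned tuple corresponds to one row of the scene array: a time label, the garbage vector, and the vehicle speed and upper gear recorded at that moment.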

Further, the upper gear selection module exchanges data with the deep-belief-network-based upper gear selection training module at the offline end to realize upper gear selection for the sweeper; the upper gear selection algorithm comprises the following steps:

step one, data acquisition and uploading: scene vectors accumulated at the online end whose upper-mounted operation was qualified, together with scene vectors whose upper-mounted operation was unqualified and has been corrected, are transmitted to the offline end through the OTA communication module;

step two, the offline end performs upper gear selection training and parameter fusion through the upper gear selection training module based on the deep belief network;

step three, the online end decompresses the update data packet sent from the offline end and uses it to update the DBN classifier at the online end, realizing continuous updating of the upper gear selection algorithm;

step four, while the online end is working, the scene vector constructed in real time by the data cleaning module is input into the DBN classifier, which outputs the optimal upper gear using the latest upper gear selection algorithm and sends the gear information to the upper-mounted controller and the online vehicle speed optimization control module.

Furthermore, the upper gear selection training module based on the deep belief network works as follows:

1) preprocessing: before training, garbage vectors are extracted from the scene vectors uploaded by the online end and grouped by upper gear label to construct one data set per label;

2) determining the training architecture: a parallel training structure is adopted, in which the input of each DBN is the data set of one upper gear label and its output is that label; the hyper-parameters of the DBN are set and the final single DBN model is determined;

3) parallel training: each DBN network trains independently on its own data set, and the parameters obtained by the DBN networks are fused after training finishes;

4) the fused DBN network parameters are packaged as the update data packet for the upper gear selection module at the online end.

Further, the parallel training step in step 3) is as follows:

(1) the parameter scheduler distributes the DBN initial parameters to the parallel DBN networks, initializing their parameters θ = {W, b, c}, where W is the weight matrix, b is the bias of the visible nodes, and c is the bias of the hidden nodes;

(2) the training data of the different gear labels are imported into the corresponding parallel DBN networks, each DBN network training on the data of one gear label;

(3) after training, the parameters of each DBN network are compared with their initial values and the change is recorded as Δθ; the parameter scheduler collects the Δθ produced by every DBN network and uses them to update the original DBN parameters;

(4) the parameter scheduler distributes the updated DBN parameters to each DBN network as the initial parameters for the next round of training, and also sends them to the online end through the OTA communication module to update the DBN classifier running online.
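The parameter scheduler's aggregation in step (3) can be sketched as below. The text does not state how the Δθ values are combined, so this sketch assumes they are summed onto the shared initial parameters (averaging would be an equally plausible choice); the function name `aggregate_deltas` and the dictionary-of-lists parameter layout are hypothetical.

```python
def aggregate_deltas(theta, trained_thetas):
    """Return theta plus the summed changes of every parallel DBN network.

    theta:          dict name -> list of floats (shared initial W, b, c, flattened)
    trained_thetas: list of dicts with the same layout, one per gear label
    """
    updated = {}
    for name, initial in theta.items():
        new_values = list(initial)
        for trained in trained_thetas:
            for k, value in enumerate(trained[name]):
                # delta theta = trained value minus the shared initial value
                new_values[k] += value - initial[k]
        updated[name] = new_values
    return updated
```

The returned dictionary plays the role of the updated DBN parameters that step (4) redistributes to the parallel networks and sends to the online end.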

Further, the online vehicle speed optimization control module adopts an online vehicle speed optimization control algorithm based on deep deterministic policy gradient (DDPG) and a multi-agent system; the algorithm steps are as follows:

step one, during online work, each vehicle performs real-time gear and vehicle speed control based on the DDPG algorithm and continuously updates its own neural network parameters;

step two, when all the sweepers finish work, each uploads its neural network parameters and its average return R_j to the offline end; the public control system at the offline end fuses the parameters of the vehicles by parameter fusion and updates the target network parameters;

step three, the target network parameters updated at the offline end are sent to the online end through the OTA communication module and replace the online end's target network parameters for the next day's work.

Further, the control process of the gear and the vehicle speed in the step one is as follows:

(1) the target neural network parameters in the public control system are downloaded to the sweeper controller and used as the initial parameters of the Actor Network and Critic Network and as the parameters of the Actor Target Network and Critic Target Network for the sweeping work;

(2) while the sweeper works, the optimal upper gear is obtained at each moment from the DBN network of the upper gear selection module; at the start of work, the upper gear and the vehicle state parameters are input together to the Actor Network, which outputs the optimal vehicle speed and gear a_t at that moment for vehicle control;

(3) the optimal vehicle speed and gear a_t and the vehicle state parameters s_t are then input to the Critic Network, which outputs the corresponding action value function Q_t(s_t, a_t);

(4) the vehicle state parameters s_{t+1} at the next moment are input to the Actor Target Network, which outputs the optimal vehicle speed and gear a_{t+1}; s_{t+1} and a_{t+1} are input together to the Critic Target Network, which outputs the action value function Q_{t+1}(s_{t+1}, a_{t+1}) at that moment, and the return function value R at that moment is calculated;

the return function value R is a weighted function of the cleaning rate and the energy utilization rate, where P_{effective} is the power the vehicle uses to overcome rolling resistance, P_{drive} is the total power demanded by the vehicle, and β and γ are the weights of the cleaning rate and the energy utilization rate;

(5) every N steps, the Critic Network parameters ω are updated by gradient back-propagation through the neural network using the mean square error loss function J(ω):

(6) the Actor Network parameters θ are updated:
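For orientation, steps (5) and (6) correspond to the critic and actor updates of the standard DDPG algorithm, which in textbook form read as follows. This is the generic formulation, not a confirmed statement of the invention's own expressions. The discount factor \lambda, the target value y_i, the actor policy \mu and the primed target-network parameters \omega', \theta' are notation assumed here; \lambda is used for the discount so as not to clash with the weight γ of the return function.

```latex
\begin{aligned}
y_i &= R_i + \lambda\, Q'\!\left(s_{i+1},\, \mu'(s_{i+1};\theta');\, \omega'\right) \\
J(\omega) &= \frac{1}{N}\sum_{i=1}^{N}\left(y_i - Q(s_i, a_i;\omega)\right)^2 \\
\nabla_{\theta} J(\theta) &\approx \frac{1}{N}\sum_{i=1}^{N}
  \nabla_{a} Q(s_i, a;\omega)\big|_{a=\mu(s_i;\theta)}\,
  \nabla_{\theta}\,\mu(s_i;\theta)
\end{aligned}
```

Here the first line uses the Actor Target Network and Critic Target Network of step (4), the second is the mean square error loss of step (5), and the third is the deterministic policy gradient used to update the Actor Network in step (6).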

further, the vehicle speed optimization control neural network parameter fusion module exchanges data with the online vehicle speed optimization control module at the online end to fuse the neural network parameters of the sweepers' vehicle speed optimization control modules; the fusion process is as follows:

(1) uploading neural network parameters of the multi-agent system: when all the sweepers finish work, they upload their respective neural network parameters θ_1, θ_2, …, θ_l and ω_1, ω_2, …, ω_l and their average returns to the offline end through the OTA communication module;

(2) fusing neural network parameters of the multi-agent system: each sweeper represents one agent, and parameters transmitted by each agent are fused according to the following method:

where τ denotes the proportion of the original parameters that is retained, and α_j is the update weight of the j-th vehicle's parameters, determined by that vehicle's average return on the same day;
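The fusion expression itself is not reproduced above. As a minimal sketch, assuming one plausible form consistent with the description (retain a fraction τ of the existing parameters and blend in a return-weighted average of the vehicles' parameters, with α_j taken as each vehicle's share of the total average return), the fusion could look like this. The function name `fuse_parameters`, the normalization used for α_j and the default value of τ are all assumptions, not the invention's stated formula.

```python
def fuse_parameters(theta_old, vehicle_thetas, avg_returns, tau=0.5):
    """Blend per-vehicle parameters into the shared target parameters.

    theta_old:      list of floats, current shared parameters
    vehicle_thetas: list of parameter lists, one per sweeper (agent)
    avg_returns:    list of each sweeper's average return for the day
    tau:            assumed fraction of the original parameters retained
    """
    total = sum(avg_returns)
    if total <= 0:
        raise ValueError("average returns must sum to a positive value")
    # Assumed weighting: alpha_j is vehicle j's share of the total return.
    alphas = [r / total for r in avg_returns]
    fused = []
    for k, old in enumerate(theta_old):
        weighted = sum(a * theta[k] for a, theta in zip(alphas, vehicle_thetas))
        fused.append(tau * old + (1.0 - tau) * weighted)
    return fused, alphas
```

Under this assumed weighting, a vehicle that achieved a higher average return contributes more to the fused parameters, which matches the stated intent of sharing speed control experience among the sweepers.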

further, the control information updating system comprises a data uploading module based on the OTA technology and a control program upgrading and updating module;

the data uploading module specifically comprises the following working processes:

(1) checking whether the current state of the sweeper and of the network allows the online end to upload data;

(2) the online end submits a data transmission request to the offline end; after identity verification, the offline end approves the upload and opens a buffer for temporary storage of the uploaded data;

(3) the online end finishes uploading the data and sends an end mark to the offline end;

(4) after the offline end receives the end mark, the connection to the buffer is closed and network resources are released;

(5) the offline end extracts the data from the buffer;

(6) the uploaded data are cleared at the online end to free storage space for the next round of data;

the control program upgrading and updating module specifically works as follows:

(1) after the offline end completes training with the training data transmitted from the online end, the trained neural network parameters are packaged and distributed to the OTA management platform;

(2) the OTA management platform and the managed sweepers authenticate each other, and the platform broadcasts the new algorithm to be updated and its version number;

(3) the online end compares the broadcast version number with the algorithm version number of the vehicle and checks whether the current vehicle condition is suitable for updating;

(4) after confirming that the current vehicle condition is suitable for updating, the online end sends an update request to the offline end;

(5) the offline end transmits the encrypted algorithm update packet to the online end; after receiving the packet, the online end verifies its integrity;

(6) after the integrity check passes, the update data of each algorithm are distributed to the respective controllers;

(7) each controller updates its algorithm with the update data in a dual-system backup redundancy upgrade mode;

(8) after the algorithm is updated, it is debugged in actual road operation; once debugging succeeds, the online end reports the upgrade success information and the current version number to the offline end;

(9) the update is finished and network resources are released.
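The version comparison, integrity check and dual-system switch of steps (3), (5) and (7) can be sketched as follows. This is an illustration under assumptions: the dotted version format, the use of a SHA-256 digest for the integrity check, the two-slot layout and all function names are hypothetical, since the text does not specify them.

```python
import hashlib


def parse_version(text):
    """Turn a dotted version string such as '1.10' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def should_update(current_version, offered_version, vehicle_ready):
    """Update only if the vehicle is ready and the offered version is newer."""
    return vehicle_ready and parse_version(offered_version) > parse_version(current_version)


def verify_package(payload, expected_sha256):
    """Integrity check: compare the payload digest with the announced digest."""
    return hashlib.sha256(payload).hexdigest() == expected_sha256


def apply_update(slots, active, payload, expected_sha256):
    """Write the update to the standby slot and switch only if it verifies.

    slots:  two-element list holding the algorithm image of each system
    active: index (0 or 1) of the slot currently in use
    Returns the index of the slot that should be active afterwards.
    """
    if not verify_package(payload, expected_sha256):
        return active  # keep running the current, known-good algorithm
    standby = 1 - active
    slots[standby] = payload
    return standby
```

Writing to the standby slot first means the previously running algorithm stays intact and can be switched back to if road debugging in step (8) fails.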

The invention has the beneficial effects that:

(1) the video stream acquired by the camera is discretized into a picture sequence with discrete time labels, the features of the discretized pictures are extracted by the garbage type recognition algorithm and the coverage rate recognition algorithm, and the result is stored as a row vector, which reduces storage and transmission cost and facilitates subsequent analysis;

(2) by taking the real-time operation scene data and the corresponding vehicle operation data as the training data of the control algorithm, the time-consuming and labor-consuming power matching test can be avoided, the actual road operation scene can be reflected in real time, and the precision and accuracy of the control algorithm are improved;

(3) data generated by unqualified upper-mounted operation are corrected using the relation, across all of the day's data, between the garbage coverage rate before and after cleaning and the gear; the corrected data are used by the offline end for further training, which improves data utilization and accumulates training data quickly;

(4) the collected state information is stored and transmitted using OTA (over-the-air) technology, which makes full use of the data collection capability of the online end and the computing resources of the offline end, and enables wireless iterative updating of both the online control algorithm parameters and the offline training database of the control algorithm;

(5) the efficiency of off-line end DBN classifier training is effectively improved by building a parallel DBN network architecture;

(6) the DDPG deep reinforcement learning algorithm is used for carrying out online optimization on the speed of the sweeping vehicle in the sweeping mode so as to improve the energy utilization rate of the vehicle;

(7) by adopting a multi-vehicle parallel operation mode, the cloud sharing of the operation data and the vehicle speed control parameters of a plurality of cleaning vehicles can be realized, and the data acquisition efficiency can be greatly improved.

Drawings

FIG. 1 is a flow chart of the screening and correction of misoperation data

FIG. 2 is a flow chart of the upper gear selection algorithm

FIG. 3 is the parallel DBN network training architecture

FIG. 4 is the single DBN network architecture

FIG. 5 is a diagram of the single DBN network training process

FIG. 6 is a flow chart of the vehicle speed optimization control algorithm

FIG. 7 is the online-end OTA hardware implementation architecture diagram

FIG. 8 is a flow chart of the software program OTA upgrade

FIG. 9 is a schematic block diagram of the data storage and backtracking platform of the N3-class electric sweeper

Detailed Description

The specific implementation of the invention is described taking a certain N3-class pure electric sweeper as an example. The sweeper has an automatic transmission and two working modes: a cleaning mode (low vehicle speed) and a transfer mode (normal vehicle speed). In the cleaning mode there are three upper gears, which can be selected manually or automatically.

The data storage and backtracking platform designed here for the N3-class electric sweeper realizes two core functions in the sweeping operation mode: upper gear selection and online vehicle speed optimization control.

The development of the upper gear selection algorithm is divided into three stages: data collection phase, debugging phase and actual application phase. The specific contents of the three stages are as follows:

(1) Data collection phase. A skilled driver operates the sweeper, and original training data are accumulated by recording the garbage scenes the driver sees together with the corresponding upper-mounted operation data. The original training data are transmitted to the offline end, whose abundant computing resources are used to train the upper gear selection algorithm.

(2) Debugging phase. By now a certain amount of original data has accumulated, and the upper gear selection algorithm can imitate the driver in choosing an upper gear from the scene information identified by the garbage recognition algorithm. The algorithm is ported to the sweeper's industrial computer; an indicator lamp for each gear the algorithm can select is installed in the cab, and judgment buttons are installed on the steering wheel. This is the semi-automatic upper gear control stage, which still requires a skilled sweeper driver to guide the upper-mounted operation. When the camera captures garbage and the scene information is extracted by the corresponding algorithm, the upper gear selection algorithm selects a gear and the gear label it outputs is shown by the corresponding indicator lamp. Judging from experience and the garbage scene in view, the driver responds with the judgment buttons: if the selected gear is reasonable, the driver presses the 'correct' button and the algorithm carries out the gear shift automatically; if not, the driver presses the 'error' button and performs the upper-mounted operation manually. These data are recorded and used for further training of the upper gear selection algorithm at the offline end; after training, the latest parameters are sent to the online end by OTA (over-the-air) technology to iteratively update the upper gear selection algorithm.

(3) Actual application phase. A skilled sweeper driver is no longer needed for upper-mounted operation; an ordinary driver simply drives normally in the sweeping operation mode according to road conditions, and the upper gear selection algorithm automatically selects the upper gear and completes the corresponding upper-mounted operation.

The online vehicle speed optimization control algorithm is built on the upper gear selection algorithm. It combines the optimal upper gear information provided by that algorithm with vehicle parameters obtained from the CAN bus, updates itself online through continuous interaction with the environment, and is updated collectively through the multi-agent parameter fusion process at the offline end.

The technical solution of the present invention is specifically described below by way of examples:

Examples

The online end comprises a data online acquisition and storage system, an upper gear selection module and an online vehicle speed optimization control module. The data online acquisition and storage system acquires and stores road garbage information online, records the sweeper's current upper gear operation, and provides training data for the upper gear selection module. The upper gear selection module provides the optimal upper gear for the current garbage scene to the online vehicle speed optimization control module as one of its inputs, which decouples upper-mounted operation from vehicle speed control, optimizes the online running speed in real time, and makes the sweeper more energy-efficient.

The offline end comprises an upper gear selection training module based on a Deep Belief Network (DBN) and a vehicle speed optimization control neural network parameter fusion module. The DBN-based upper gear selection training module trains the upper gear selection algorithm on the data collected at the online end, obtains the relevant parameters of the algorithm, and sends them to the upper gear selection module at the online end to iteratively update the upper gear control algorithm there. The vehicle speed optimization control neural network parameter fusion module fuses the neural network parameters of the online vehicle speed optimization control modules of multiple sweepers and transmits the fused parameters to each sweeper, so that the sweepers share their speed control experience.

The data online acquisition and storage system comprises a data acquisition module and a data storage module.

The data acquisition module is responsible for acquiring information from the front camera, the rear camera and the CAN bus, cleaning the acquired information, eliminating unreasonable data, realizing data synchronization, and constructing input data required by on-line gear and vehicle speed control on the basis of the data. The working steps of the data acquisition module are as follows:

step one, data discretization: the method comprises the following steps of carrying out discretization sampling on video information acquired by a front camera and a rear camera to form a picture sequence with the same time interval and a time label, and specifically comprising the following steps:

The longitudinal length of the camera's recognition area is 8 m and the sweeper usually works at 8 km/h, so in theory it traverses one recognition area in 3.6 s. The scene sampling period can therefore be set to 3 s, which avoids frequent gear switching of the upper-mounted system while ensuring that no road surface information is missed. In summary, the time-continuous video collected by the camera is converted by periodic sampling into a picture sequence with a 3 s sampling interval and time labels.

The vehicle speed and upper gear information transmitted over the CAN bus is already sampled at intervals, so it needs no additional discretization; its time labels are simply used for matching in step four.
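The sampling-period reasoning in step one can be checked with a short calculation. The function name and the overlap figure are illustrative additions; the 8 m region length, 8 km/h working speed and 3 s period come from the text.

```python
def traversal_time_s(region_length_m, speed_kmh):
    """Time needed to drive the length of one recognition area."""
    speed_ms = speed_kmh / 3.6  # km/h to m/s
    return region_length_m / speed_ms


region_length_m = 8.0
speed_kmh = 8.0
sampling_period_s = 3.0

# About 3.6 s to traverse one 8 m recognition area at 8 km/h.
t_traverse = traversal_time_s(region_length_m, speed_kmh)

# Distance covered between two samples, and the resulting overlap
# between consecutive recognition areas (positive means no gap).
distance_per_sample_m = sampling_period_s * speed_kmh / 3.6
overlap_m = region_length_m - distance_per_sample_m
```

Because the sampling period is shorter than the traversal time, consecutive sampled frames overlap slightly, which is why no stretch of road surface goes unobserved at this speed.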

Step two, data cleaning: checking whether each attribute of the discretized picture sequence and data transmitted by a CAN bus is complete and reasonable, eliminating picture data with blocked pictures and filling up missing data attributes, and the concrete steps are as follows:

The data quality check focuses on completeness and validity, that is, whether every attribute of the data is present and reasonable. For the discretized image data, each picture is first checked for a corresponding time label; data acquired while the camera was blocked are then screened out by the proportion of black pixels, and all images and time labels preceding a blocked, completely black image are discarded, so that the subsequent image sequence is continuous in time and unobstructed. For data transmitted over the CAN bus, missing values are checked, and unreasonable data (for example a negative time, or a speed whose absolute value exceeds the vehicle's maximum speed) are screened out and replaced with '00'.
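The cleaning rules of step two can be sketched as below. The pixel threshold, the black-pixel ratio, the maximum speed, the field names and the function names are hypothetical values chosen for illustration; the sketch also assumes that missing CAN fields receive the same '00' placeholder as unreasonable ones, and it interprets the occlusion rule as keeping only the frames after the last blocked frame.

```python
def is_occluded(image, black_threshold=16, max_black_ratio=0.9):
    """Flag a frame as blocked when most of its pixels are near black.

    image: 2-D list of grayscale values in the range 0..255
    """
    pixels = [p for row in image for p in row]
    if not pixels:
        return True
    black = sum(1 for p in pixels if p <= black_threshold)
    return black / len(pixels) >= max_black_ratio


def keep_after_last_occlusion(frames):
    """frames: list of (timestamp, image); drop everything up to the last blocked frame."""
    last = -1
    for i, (_, image) in enumerate(frames):
        if is_occluded(image):
            last = i
    return frames[last + 1:]


def clean_can_record(record, max_speed_kmh=60.0):
    """Replace missing or unreasonable CAN fields with the '00' placeholder."""
    cleaned = dict(record)
    t = record.get("time")
    if t is None or t < 0:
        cleaned["time"] = "00"
    v = record.get("speed")
    if v is None or abs(v) > max_speed_kmh:
        cleaned["speed"] = "00"
    return cleaned
```

Keeping only the frames after the last blocked frame guarantees that the remaining picture sequence is continuous in time, at the cost of discarding some valid frames recorded before the occlusion.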

Step three, garbage feature extraction: the method comprises the steps of obtaining type information and position distribution information of the garbage by using a road garbage recognition algorithm based on fast-RCNN and image area division, obtaining coverage rate information before and after cleaning of each garbage by using a garbage coverage rate recognition algorithm and camera calibration data which are constructed in advance and aim at different vehicle types, and folding three types of information of the garbage type, the position distribution and the coverage rate before cleaning into a row vector which is called a garbage vector. And screening out scenes without identifying the junk objects, and replacing all elements in the junk species information corresponding to the scenes with labels of '00'.

The garbage features comprise the type and distribution of the garbage, the coverage rate of each type, and the cleanliness of the road surface after cleaning. They are obtained as follows:

(1) Extraction of garbage type features

Based on machine vision, a convolutional neural network is used to identify the type of road garbage. Sample pictures are annotated with a picture annotation tool to mark the target windows and build a sample library, and a road garbage recognition model is then built with the Fast-RCNN algorithm. Given an image of road garbage, the model outputs the garbage type and the center coordinates of the garbage anchor box; the anchor-box center is used to represent the spatial distribution of the garbage. Scenes in which no garbage is identified are screened out, and all elements of the garbage type information for those scenes are replaced with the label '00'.

(2) Extraction of garbage distribution information

The codes 1 to 9 denote nine types of garbage: plastic bottles, orange peels, banana peels, plastic bags, milk bags, packaging bags, stones, cigarette packs and sandy soil. The image captured by the camera is divided into n × m regions and an n × m zero matrix is initialized. The region in which each piece of garbage lies is determined from its extracted center coordinates, and the matrix element corresponding to that region is replaced with the garbage code, giving the garbage distribution matrix. When the region division is sufficiently fine, each element of the matrix represents only one piece of garbage. For use, the matrix is flattened into an (n × m)-dimensional row vector that serves as the input of the neural network.
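The construction of the garbage distribution matrix can be sketched as below. This is an illustrative sketch under stated assumptions: the function name, the `(code, cx, cy)` detection tuple, the top-left pixel origin, the choice of n rows by m columns and the row-major flattening are not specified in the text.

```python
def garbage_distribution_vector(detections, image_w, image_h, n, m):
    """Build the n x m garbage distribution matrix and flatten it.

    `detections` is a list of (code, cx, cy) tuples, where `code` is the
    garbage code 1..9 and (cx, cy) is the anchor-box center in pixels
    (origin assumed at the top-left corner of the image).
    """
    # Start from an n x m zero matrix.
    matrix = [[0] * m for _ in range(n)]

    for code, cx, cy in detections:
        # Map the center point to its region; clamp points on the far edge.
        row = min(int(cy * n / image_h), n - 1)
        col = min(int(cx * m / image_w), m - 1)
        matrix[row][col] = code

    # Flatten row by row into an (n * m)-dimensional row vector.
    return [value for matrix_row in matrix for value in matrix_row]
```

For a 640 × 480 image split into 2 × 2 regions, a plastic bottle (code 1) near the top-left corner and a stone (code 7) near the bottom-right corner give the vector `[1, 0, 0, 7]`. If two detections fall in the same region the later one overwrites the earlier one, which is why the text requires a sufficiently fine division.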

(3) Extraction of garbage coverage information

Because of the mounting position and angle of the camera, the pixel size of garbage in the recognized image has a specific linear relationship with distance, so the same garbage appears at different sizes depending on its distance from the camera. Camera calibration is therefore required. Specifically, a checkerboard method is used to partition the target detection area of the image, a far-near parameter matrix is calibrated by experiment, a larger coverage compensation is applied to distant garbage and a smaller one to nearby garbage, and a coverage compensation matrix for the camera is constructed to restore the true coverage of the garbage as closely as possible, so that road garbage coverage is detected on a unified scale.

For N3-class sweepers of the same size specification, the camera mounting positions are uniform, so the cameras mounted on sweepers of the same size specification share the same calibration data. Therefore, once a camera calibration database has been obtained in advance by experiment for N3-class sweepers of each size specification, the influence of different camera mounting positions can be eliminated and the garbage coverage data can be normalized. The garbage recognition algorithm then applies to N3-class sweepers of all size specifications, which ensures that the platform built according to the invention is universally applicable to them.

A garbage coverage extraction algorithm universally applicable to N3-class sweepers of all size specifications is thus obtained. The process is as follows: first, shadow removal and image blocking are applied to the input image; after blocking, Gaussian denoising is performed, the Sobel operator gradient is computed, and the target and background are preliminarily segmented using a binarization area threshold; next, incoherent points in the segmented image are removed by a morphological method; then the size specification of the vehicle is entered on the user interface, the coverage compensation matrix for that size specification is retrieved from the database, and the coverage information in the image is normalized; finally, the road garbage coverage information is extracted from the segmented image to obtain the proportion and coverage rate of each type of garbage.

(4) Extraction of cleanliness information of road surfaces before and after cleaning

Using the road garbage coverage detection algorithm on the images from the front and rear cameras, the sum of the coverage rates of all types of garbage on the road surface before and after cleaning is obtained. This sum, obtained under a given gear, is defined as the road surface cleanliness and denoted S_k, as follows:

$$S_k = \sum_{i=1}^{9} m_i$$

where k = 0, 1; S₀ is the cleanliness of the road surface before cleaning and S₁ is the cleanliness of the road surface after cleaning; m_i is the coverage rate of the i-th type of garbage, with i = 1, …, 9 denoting plastic bottles, orange peels, banana peels, plastic bags, milk bags, packaging bags, stones, cigarette packs and sandy soil.

The road surface cleanliness before and after cleaning, S₀ and S₁, is matched with the corresponding upper-mounted gear in the subsequent data synchronization step and serves as an important basis for subsequent optimization.

At this point, the type, distribution and coverage information of the garbage is available under a unified time coordinate, and the three kinds of information are folded into a row vector called the garbage vector.

Step four, data synchronization: select a time point as the reference, match the various types of information by time, and delete the rows labeled '00' in the matched information. This eliminates data at redundant time nodes and invalid training data from clean road surfaces, synchronizes the various types of data and constructs the scene array. Each row of the scene array is a scene vector comprising the garbage information, the vehicle running information and the post-cleaning road surface cleanliness S₁ identified in one sampling period. The specific steps are as follows:

A time point is selected as the reference, and the garbage type, position, coverage before and after cleaning, vehicle speed and upper-mounted gear information are matched against it to form the scene array. Each row of the scene array is the scene vector for one time point, comprising the garbage vector, the vehicle speed, the upper-mounted gear and the road garbage coverage after cleaning in that gear. Because the sampling period of the garbage vector is longer than that of the data transmitted over the CAN bus, the garbage vector is missing at some time points, and the corresponding positions are filled with '00'.

Because there is a considerable longitudinal distance between the camera recognition area and the operating area of the upper-mounted mechanism, the gear information corresponding to each garbage vector in a scene vector is not the real-time gear at the moment the garbage vector is acquired; the corresponding gear information lags by a time interval Δt₁. The specific formula for Δt₁ is as follows:

In the formula, L is the longitudinal distance between the near end of the camera recognition area and the operating area of the upper-mounted system; v is the vehicle speed contained in the scene information (because the vehicle speed changes little during normal sweeping, it can be used to calculate the time interval); t₁ is the program response time, taken as 0.5 s; and t₂ is the response time of the upper-mounted system, taken as 1 s.

Similarly, there is a time interval Δt₂ between the post-cleaning road surface cleanliness S₁ and the garbage vector collected in real time. It is the interval between the front camera and the rear camera capturing the same road surface area, and can be expressed by the following formula:

where L₁ is the distance between the two camera mounting positions, L₂ is the length of the camera's field of view, and v is the real-time vehicle speed.

Because the gear information and the post-cleaning road surface cleanliness information lag, the completion of each scene vector in the scene array is delayed by (Δt₂ + 1) s. This guarantees that the upper-mounted gear and post-cleaning cleanliness information are acquired without affecting the real-time acquisition of garbage information.

Finally, using the '00' label as the screening criterion, the rows containing it are deleted, which eliminates redundant data and invalid training data from clean road surfaces. This completes the construction of a feature array at the same time interval as the garbage information sampling and synchronizes all the data.
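The matching and screening logic of step four can be sketched as follows. This is a simplified illustration: the function name and the dictionary-based inputs are assumptions, records are matched on identical time labels, and the lags Δt₁ and Δt₂ described above are deliberately left out.

```python
PLACEHOLDER = "00"  # label marking missing or invalid entries


def build_scene_array(can_samples, garbage_vectors):
    """Match CAN data and garbage vectors by time label, then drop '00' rows.

    can_samples:     {time: (speed, gear)}  sampled on the CAN bus period
    garbage_vectors: {time: vector or '00'} sampled on the longer camera period
    """
    rows = []
    for t in sorted(can_samples):
        speed, gear = can_samples[t]
        # Time points without a garbage vector are filled with '00'.
        garbage = garbage_vectors.get(t, PLACEHOLDER)
        rows.append((t, garbage, speed, gear))

    # Delete every row that contains the '00' label, which removes redundant
    # time nodes, invalid CAN values and clean-road scenes in one pass.
    return [row for row in rows if PLACEHOLDER not in row]
```

Only time points at which both a valid garbage vector and valid CAN data exist survive, so the resulting array has the same time interval as the garbage information sampling.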

The data storage module provides temporary storage for the data constructed by the data acquisition module and for the operation video of the current day. It screens the data, picks out the operation scenes with unsatisfactory results, modifies their gear information, and stores them in a designated area so that the offline end can relearn the operation scene information. The working process of the data storage module is shown in Fig. 1, and the steps are as follows:

Step one: for the cleaning data of the current day, extract the garbage coverage before and after cleaning (S₀ and S₁, respectively) and the upper-mounted gear, and calculate the value of (S₀ − S₁).

Step two: classify all the data into two groups according to S₁ < 10% and S₁ > 10%.

Step three: for the data with S₁ < 10%, summarize the correspondence between the different gears and the intervals of (S₀ − S₁).

Step four: when S₁ > 10%, copy the scene vector from the scene array, determine which gear's (S₀ − S₁) interval the range [S₀, (S₀ − 5%)] falls in, and take the gear label corresponding to that interval as the candidate gear for modifying the data.

Step five: if the candidate gear label is larger than the original gear label and smaller than the highest gear label, replace the original gear label with the candidate gear label; if the candidate gear label is smaller than or equal to the original gear label, raise the original gear by one; if the candidate gear is larger than the highest gear label, extract the time label, capture the corresponding picture from the cached video according to the time label, package and store the screenshot together with the corresponding scene vector, and finally send them to the cloud for manual analysis.

Step six: store the scene vectors with upgraded gear data in the designated area and send them periodically to the offline end through the OTA communication module, so that the offline end can relearn the operation scene information.
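The gear relabeling rule of step five can be sketched as a small decision function. The function name and the returned action strings are hypothetical. The case in which the candidate gear equals the highest gear and exceeds the original gear is not covered by the wording of step five, so the sketch flags it instead of guessing.

```python
def relabel_gear(original, candidate, highest):
    """Apply the step-five rule to one unsatisfactory scene (S1 > 10%).

    Returns an (action, gear) pair; the action names are illustrative.
    """
    if candidate > highest:
        # No existing gear is sufficient: keep the label and send the
        # screenshot and scene vector to the cloud for manual analysis.
        return ("manual_review", original)
    if original < candidate < highest:
        # The candidate gear is a valid upgrade: adopt it.
        return ("replace", candidate)
    if candidate <= original:
        # The candidate offers no upgrade: raise the original gear by one.
        return ("raise_one", original + 1)
    # candidate == highest > original: not specified in the text (assumption:
    # leave the label unchanged and flag the scene).
    return ("unspecified", original)
```

With three gears, `relabel_gear(1, 2, 3)` adopts gear 2, while `relabel_gear(2, 1, 3)` raises gear 2 to gear 3.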

The upper-mounted gear selection module exchanges data with the deep belief network (DBN)-based upper-mounted gear selection training module at the offline end to realize upper-mounted gear selection for the sweeper. The algorithm flow of upper-mounted gear selection is shown in Fig. 2, and the specific algorithm steps are as follows:

Secondly, the deep belief network (DBN)-based upper-mounted gear selection training module at the offline end performs upper-mounted gear selection training and parameter fusion:

The online vehicle speed optimization control module adopts an online vehicle speed optimization control algorithm based on the deep deterministic policy gradient (DDPG) and a multi-agent system. The algorithm flow is shown in Fig. 6, and the algorithm steps are as follows:

Step one: during online operation, the gear and vehicle speed are controlled in real time based on the DDPG algorithm, and the neural network parameters are continuously updated.

Step two: when all the sweepers finish working, their respective neural network parameters and average rate of return R_j are uploaded to the offline end, the parameters of all vehicles are fused by the parameter fusion method in the common control system of the offline end, and the target network parameters are updated.

Specifically, the vehicle speed and gear control process at the online end in step one is as follows:

(1) Download the target neural network parameters from the common control system to the sweeper controller, to serve as the initial parameters of the Actor Network and Critic Network and as the parameters of the Actor Target Network and Critic Target Network for this sweeping operation.

(2) While the sweeper is working, the optimal upper-mounted gear at each moment is obtained from the DBN network of the upper-mounted gear selection module. At the start of work, the upper-mounted gear and vehicle state parameters such as the battery SOC, the road gradient and the required vehicle power are input together into the Actor Network, which outputs the optimal vehicle speed and gear a_t at that moment for vehicle control.

(3) The optimal vehicle speed and gear a_t at that moment and the vehicle state parameter s_t are then input into the Critic Network, which outputs the corresponding action value function Q_t(s_t, a_t).

(4) The vehicle state parameter s_{t+1} at the next moment is input into the Actor Target Network, which outputs the optimal vehicle speed and gear a_{t+1}; s_{t+1} and a_{t+1} are input together into the Critic Target Network, which outputs the action value function Q_{t+1}(s_{t+1}, a_{t+1}) at that moment; and the return function value R at that moment is calculated. R is a weighted function of the cleaning rate and the energy utilization rate, where P_effective is the power used by the vehicle to overcome rolling resistance, P_drive is the total required power of the vehicle, and β and γ are the weights of the cleaning rate and the energy utilization rate.

(5) Every N steps, the Critic Network parameter ω is updated by gradient backpropagation through the neural network using the mean squared error loss function J(ω).

(6) The Actor Network parameter θ is updated.
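The Critic update in step (5) can be illustrated with the loss used in the standard DDPG algorithm. The exact expression for J(ω) is not reproduced in the text, so the following is a sketch under that assumption: the target is the return R plus a discounted value from the Critic Target Network, and the loss is the mean squared error between target and Critic output. The function name and the `discount` parameter are hypothetical; `discount` is named this way to avoid confusion with the reward weight γ used above.

```python
def critic_mse_loss(rewards, q_values, next_q_values, discount=0.99):
    """Mean squared error loss for the Critic over a batch of N steps.

    rewards:       return function values R for each step
    q_values:      Q_t(s_t, a_t) from the Critic Network
    next_q_values: Q_{t+1}(s_{t+1}, a_{t+1}) from the Critic Target Network
    discount:      discount factor (assumed; not given in the text)
    """
    # Standard DDPG target: y = R + discount * Q_{t+1}(s_{t+1}, a_{t+1}).
    targets = [r + discount * q_next
               for r, q_next in zip(rewards, next_q_values)]

    # J(omega) = mean over the batch of (y - Q_t(s_t, a_t))^2.
    squared_errors = [(y - q) ** 2 for y, q in zip(targets, q_values)]
    return sum(squared_errors) / len(squared_errors)
```

In a full implementation this scalar would be differentiated with respect to ω by an automatic differentiation framework; the sketch only shows how the quantities from steps (3) and (4) combine.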

The off-line end comprises an upper gear selection training module based on a Deep Belief Network (DBN) and a vehicle speed optimization control neural network parameter fusion module.

The deep belief network (DBN)-based upper-mounted gear selection training module exchanges data with the upper-mounted gear selection module at the online end, and together they realize upper-mounted gear selection for the sweeper. The algorithm flow of upper-mounted gear selection is shown in Fig. 2, and the specific working steps of the DBN-based upper-mounted gear selection training module are as follows:

Specifically, the parallel training process in step 3) is shown in Fig. 3. The sweeper in this example has three upper-mounted gears, so k = 3 in the corresponding process. The parallel training steps are as follows:

(1) The parameter scheduler distributes the DBN initial parameters to 3 parallel DBN networks and initializes their parameters θ = {W, b, c}, where W is the weight matrix, b is the bias of the visible nodes, and c is the bias of the hidden nodes.

(2) Training data with the gear labels 'gear 1', 'gear 2' and 'gear 3' is imported into the 3 parallel DBN networks respectively, and each DBN network is trained on the data of one gear label. The structure of a single DBN network and its training process are shown in Fig. 4 and Fig. 5, respectively.

(3) After training, the parameters of each DBN network have changed from their initial values; this change is denoted Δθ. The parameter scheduler aggregates the Δθ produced by each DBN network to update the original DBN parameters.
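The aggregation performed by the parameter scheduler in step (3) can be sketched as follows. The text does not state how the Δθ values are combined, so adding their sum to the initial parameters is an assumption (an average would be an equally plausible reading). Parameters are represented as flat lists of floats, and the function name is hypothetical.

```python
def aggregate_updates(theta, deltas):
    """Update the shared DBN parameters from the parallel training results.

    theta:  flat list of initial parameters (W, b, c flattened together)
    deltas: list of parameter changes, one flat list per parallel DBN
    """
    updated = list(theta)  # leave the caller's initial parameters untouched
    for delta in deltas:
        for index, change in enumerate(delta):
            # Assumption: the scheduler applies the sum of all changes.
            updated[index] += change
    return updated
```

With three gears, `deltas` holds three lists, one from each DBN network trained on a single gear label.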

The vehicle speed optimization control neural network parameter fusion module exchanges data with the online vehicle speed optimization control module at the online end to fuse the neural network parameters of the sweeper vehicle speed optimization control module. The neural network parameter fusion process is as follows:

(1) Uploading the neural network parameters of the multi-agent system: when all the sweepers finish working, their respective neural network parameters θ₁, θ₂ … θ_l, ω₁, ω₂ … ω_l and average rates of return are uploaded to the offline end through the OTA communication module;

It is determined that,

The control information updating system relies on OTA (over-the-air) technology to realize real-time control information exchange between the online end and the offline end, and specifically comprises a data uploading module and an OTA-based control program upgrading and updating module. The hardware architecture built at the online end to realize these functions is shown in Fig. 7.

The data uploading module decouples the online end from the offline end by placing a cache region between them. The communication logic between the online end and the offline end follows a 'producer-consumer' pattern, and the specific process is as follows:

(1) the online end checks whether the current vehicle and network state allow data to be uploaded;

(5) the offline end extracts data from the cache region;

(6) the online end clears the uploaded data, releasing the storage space for the next round of data.
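The 'producer-consumer' pattern with an intermediate cache region can be sketched with a bounded queue from the Python standard library. This is a generic illustration of the pattern, not the platform's communication stack: the function name, the in-process threads standing in for the online and offline ends, and the sentinel used to end the round are assumptions.

```python
import queue
import threading


def run_upload_cycle(scene_vectors, buffer_size=4):
    """Move scene vectors from the online end to the offline end via a buffer."""
    buffer = queue.Queue(maxsize=buffer_size)  # the cache region
    received = []                              # data held by the offline end
    sentinel = None                            # marks the end of this round

    def online_end():
        # Producer: put() blocks while the cache region is full.
        for vector in scene_vectors:
            buffer.put(vector)
        buffer.put(sentinel)

    def offline_end():
        # Consumer: get() blocks while the cache region is empty.
        while True:
            item = buffer.get()
            if item is sentinel:
                break
            received.append(item)

    threads = [threading.Thread(target=online_end),
               threading.Thread(target=offline_end)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received
```

Because each side only touches the buffer, neither end needs to know the other's processing speed, which is the decoupling described above.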

To avoid the risk of the controller crashing because of a failed upgrade, the control program upgrading and updating module upgrades the controller in a dual-system backup redundancy mode. While system A is being upgraded, system B serves as the redundant backup and keeps the system working normally. After system A has been upgraded successfully, it is debugged in live operation while system B remains online, so that system B can immediately take over control and keep the vehicle running normally if the debugging of system A fails. As shown in Fig. 8, the specific process of the control program upgrading and updating module is as follows:

(1) After the offline end completes training using the training data transmitted from the online end, it packages the trained neural network parameters and distributes them to the OTA management platform.

(2) The OTA management platform and the managed sweepers perform bidirectional identity verification, and the platform broadcasts the new algorithm to be updated and its version number.

(3) The online end compares the algorithm version number of the vehicle and checks whether the current vehicle condition is suitable for updating.

(4) After confirming that the current vehicle condition is suitable for updating, the online end sends an update request to the offline end.

(5) The offline end transmits the algorithm update packet to the online end in encrypted form; after receiving the algorithm update packet, the online end verifies its integrity.

(6) After the integrity check passes, the update data of each algorithm is distributed to the respective controller.

(7) Each controller updates its algorithm with the update data in the dual-system backup redundancy upgrade mode.

(8) After the algorithm is updated, it is debugged in actual road operation; once debugging succeeds, the online end reports the upgrade success information and the current version number to the offline end.

(9) The update is finished and the network resources are released.
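Steps (3), (5) and (7) can be sketched for a single controller as below. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, version numbers are treated as comparable integers, a SHA-256 digest stands in for the integrity check (the text does not name the method), and the `debug_ok` callback stands in for the actual road debugging of step (8). Identity verification and encryption are omitted.

```python
import hashlib


class DualSystemController:
    """Controller with an active system and a standby system for upgrades."""

    def __init__(self, version, params):
        self.active = {"version": version, "params": params}  # keeps running
        self.standby = None                                   # receives upgrades

    def apply_update(self, package, expected_sha256, new_version, debug_ok):
        # Step (3): skip the update if the vehicle already runs this version.
        if new_version <= self.active["version"]:
            return "up_to_date"

        # Step (5): verify the integrity of the received update packet.
        if hashlib.sha256(package).hexdigest() != expected_sha256:
            return "integrity_failed"

        # Step (7): install on the standby system while the active one runs.
        self.standby = {"version": new_version, "params": package}

        # Step (8): debug the upgraded system; on failure the old system
        # simply stays in control.
        if not debug_ok(self.standby):
            self.standby = None
            return "rolled_back"

        # Debugging succeeded: swap roles and keep the old system as backup.
        self.active, self.standby = self.standby, self.active
        return "updated"
```

A failed integrity check or failed debugging leaves `active` untouched, which is the property the dual-system mode is meant to guarantee.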

Claims

1. A data storage and backtracking platform for an N3-class electric sweeper is characterized by comprising an on-line end and an off-line end;

2. The data storage and backtracking platform for the N3-class electric sweeper as claimed in claim 1, wherein the data online acquisition and storage system comprises a data acquisition module and a data storage module, the data acquisition module is responsible for acquiring information from front and rear cameras and a CAN bus of the sweeper, cleaning the acquired information, eliminating unreasonable data, realizing data synchronization, and constructing input data required by an upper gear selection module and an online vehicle speed optimization control module; the data storage module provides temporary storage space for the data constructed by the data acquisition module and the operation video of the current day, screens the data, screens out operation scenes with unsatisfactory operation effect, changes gear information of the operation scenes, and stores the data in a designated area for the offline end to relearn the operation scene information.

3. The data storage and backtracking platform for the N3 electric sweeper as claimed in claim 2, wherein the data acquisition module is operated as follows:

4. The data storage and backtracking platform for the N3-class electric sweeper as claimed in claim 1, wherein the upper-mounted gear selection module exchanges data with the deep belief network-based upper-mounted gear selection training module at the offline end to realize upper-mounted gear selection for the sweeper, and the upper-mounted gear selection algorithm comprises the following steps:

5. The data storage and backtracking platform for the class-N3 electric sweeper according to claim 4, wherein the specific working steps of the upper gear selection training module based on the deep belief network are as follows:

6. The data storage and backtracking platform for the class N3 electric sweeper according to claim 5, wherein the parallel training step in the step 3) is as follows:

7. The data storage and backtracking platform for the N3-class electric sweeper as claimed in claim 1, wherein the online vehicle speed optimization control module adopts an online vehicle speed optimization control algorithm based on the deep deterministic policy gradient and a multi-agent system, and the algorithm steps are as follows:

step two, when all the sweepers finish working, their respective neural network parameters and average rates of return are uploaded to the offline end, the parameters of all vehicles are fused by the parameter fusion method in the common control system of the offline end, and the target network parameters are updated;

8. The data storage and backtracking platform for the class-N3 electric sweeper of claim 7, wherein the gear and vehicle speed control process in the step one is as follows:

(2) while the sweeper is working, the optimal upper-mounted gear at each moment is obtained from the DBN network of the upper-mounted gear selection module; at the start of work, the upper-mounted gear and the vehicle state parameters are input together into the Actor Network, which outputs the optimal vehicle speed and gear a_t at that moment for vehicle control;

the return function value R is a weighted function of the cleaning rate and the energy utilization rate, where P_effective is the power used by the vehicle to overcome rolling resistance, P_drive is the total required power of the vehicle, and β and γ are the weights of the cleaning rate and the energy utilization rate;

(6) updating an Actor Network parameter theta:

9. The data storage and backtracking platform for the N3-class electric sweeper according to claim 7, wherein the vehicle speed optimization control neural network parameter fusion module exchanges data with the online vehicle speed optimization control module at the online end to fuse the neural network parameters of the sweeper vehicle speed optimization control module, and the neural network parameter fusion process is as follows:

Uploading to an off-line end through an OTA communication module;

It is determined that,

10. The data storage and backtracking platform for the N3-class electric sweeper of claim 1, wherein the control information updating system comprises a data uploading module and a control program upgrading and updating module based on OTA technology;

(5) the offline terminal extracts data from the cache region;

(9) finishing the update and releasing the network resources.