CN111145538A - Stereo perception system suitable for audio and video acquisition, recognition and monitoring on highway

Info

Publication number: CN111145538A
Application number: CN201911244491.3A
Authority: CN
Inventors: 徐清峻; 王风春; 刘扬; 王晓东; 张刚刚; 梁昭
Original assignee: Qilu Traffic Information Group Co ltd
Current assignee: Qilu Traffic Information Group Co ltd
Priority date: 2019-12-06
Filing date: 2019-12-06
Publication date: 2020-05-12

Abstract

The invention provides a stereo perception system suitable for audio and video acquisition, identification and monitoring on a highway. The system comprises a plurality of groups of stereo perception components and a processor module, wherein different stereo perception components are arranged at different positions of the highway, and the monitoring areas of two adjacent groups of stereo perception components have overlapping parts. The stereo perception system perceives the environment and traffic conditions on the highway stereoscopically: video monitoring plays a role like the eyes of an inspector, and audio monitoring plays a role like the ears of an inspector. Because the monitoring areas of different stereo perception components overlap, the components can cover the whole range of the highway, which compensates for the inability of manual inspection to monitor all areas in real time.

Description

Stereo perception system suitable for audio and video acquisition, recognition and monitoring on highway

Technical Field

The invention relates to the technical field of intelligent traffic control, in particular to a stereo perception system suitable for audio and video acquisition, identification and monitoring on a road.

Background

At present, cameras are installed along roads at intervals and continuously photograph passing vehicles, so that whether a vehicle commits a violation can be determined. Obtaining a usable, clear image in this way places high demands on visibility. Although light sources are installed beside many cameras to provide auxiliary lighting during shooting, performance in fog, rain, snow and similar weather still needs to be improved.

In addition, road information collection in the prior art relies only on the video information collected by cameras and is therefore not comprehensive. Safety early warning still depends on a large number of police personnel, which incurs a high labor cost.

Disclosure of Invention

Therefore, the invention aims to solve the technical problems that the road information acquisition system in the prior art cannot realize comprehensive acquisition of road information and needs to consume larger labor cost, and further provides the stereo perception system suitable for audio and video acquisition, identification and monitoring on the road.

In order to solve the technical problems, the invention provides a stereo perception system suitable for audio and video acquisition, identification and monitoring on a road, which comprises a plurality of groups of stereo perception components and a processor module, wherein different stereo perception components are arranged at different positions of the road, and monitoring areas of two adjacent groups of stereo perception components are provided with overlapping parts; wherein:

a first machine learning model and a second machine learning model are preset in the processor module, the first machine learning model is used for receiving video signals and outputting analysis results aiming at the video signals, and the second machine learning model is used for receiving audio signals and outputting analysis results aiming at the audio signals;

the stereo perception component comprises a video detector and an audio detector; the video detector is used for collecting video detection signals in a monitoring area and outputting the video detection signals to the processor module; the audio detector is used for collecting audio detection signals in a monitoring area and outputting the audio detection signals to the processor module;

the processor module receives the video detection signal and the audio detection signal, analyzes the video detection signal through the first machine learning model and outputs a first analysis result aiming at the video detection signal, and analyzes the audio detection signal through the second machine learning model and outputs a second analysis result aiming at the audio detection signal;

and acquiring a three-dimensional sensing result corresponding to the corresponding detection area according to the detection information obtained by fusing the first analysis result and the second analysis result.

Preferably, in the above stereoscopic sensing system for audio-video acquisition, identification and monitoring on roads:

the three-dimensional sensing assembly also comprises an ultrasonic detector, and the ultrasonic detector is used for detecting the state of the obstacles in the monitoring area and outputting an ultrasonic detection result to the processor module;

an ultrasonic signal analysis model is preset in the processor module and analyzes the ultrasonic detection result to determine the state of a first obstacle in the monitored area;

the processor module further comprises a verification module, the verification module receives a first analysis result output by the first machine learning model and a first obstacle state output by the ultrasonic signal analysis model, verifies the first analysis result by using the first obstacle state, and determines that the first analysis result is valid if the consistency between the first obstacle state and the first analysis result exceeds a set value.

the stereoscopic sensing assembly further comprises an infrared detector, and the infrared detector is used for detecting the state of the obstacles in the monitoring area and outputting an infrared detection result to the processor module;

an infrared signal analysis model is preset in the processor module and analyzes the infrared detection result to determine the state of the barrier in the monitored area;

the verification module is further used for receiving a first analysis result output by the first machine learning model and an obstacle state output by the infrared signal analysis model, verifying the first analysis result by using the obstacle state, and judging that the first analysis result is valid if the consistency between the obstacle state and the first analysis result exceeds a set value.

the processor module is also provided with a training sample storage unit; the training sample storage unit is used for storing training sample data, and the training sample data comprises sample video data and sample audio data; the sample video data is used to train the first machine learning model; the sample audio data is used to train the second machine learning model.

the processor module is also provided with a test sample storage unit; the test sample storage unit is used for storing test sample data, and the test sample data comprises video test data and audio test data; the video test data is used for testing the trained first machine learning model; the audio test data is used for testing the trained second machine learning model.

Preferably, in the above stereo perception system for audio-video capture, identification and monitoring on a road, further comprising:

a solar cell module, wherein the electric energy output end of the solar cell module is electrically connected with the power supply ends of the stereo perception component and the processor module; and/or

a wind energy battery assembly, wherein the electric energy output end of the wind energy battery assembly is electrically connected with the power supply ends of the stereo perception component and the processor module.

an electronic fence, which comprises a coil sensor buried at a set distance around the stereo perception component; the coil sensor detects the pressure value applied to it, and if the pressure value exceeds a set pressure threshold, the coil sensor sends an alarm signal to the processor module.

a vibration sensor arranged on the packaging shell of the stereo perception component; the vibration sensor detects the vibration frequency or the vibration amplitude of the stereo perception component, and if the vibration frequency exceeds a frequency threshold or the vibration amplitude exceeds an amplitude threshold, the vibration sensor sends an alarm signal to the processor module.

Preferably, in the above stereoscopic sensing system suitable for audio and video acquisition, identification and monitoring on a road, the system further comprises an early warning module and a prompt module:

the early warning module receives a first analysis result output by the first machine learning model, a second analysis result output by the second machine learning model, an ultrasonic detection result output by an ultrasonic detector and an infrared detection result output by an infrared detector, and predicts the road traffic condition according to the first analysis result, the second analysis result, the ultrasonic detection result and the infrared detection result;

and controlling the prompt module to send out a prompt signal at a set time point according to the predicted road traffic condition.

Preferably, in the stereo perception system suitable for audio-video acquisition, identification and monitoring on roads:

the prompting module comprises a high-pitch loudspeaker and/or an optical spotlight; the high-pitch loudspeaker is used for playing specific voice prompts, and the optical spotlight is used for focusing light on the area where the vehicle is located.

Compared with the prior art, the technical scheme of the invention at least has the following beneficial effects:

(1) the stereo perception system is arranged on the highway to stereoscopically perceive the environment and traffic conditions on the highway, wherein the video monitoring can play a role like eyes of inspectors, the audio monitoring can play a role like ears of the inspectors, and the monitoring areas of different stereo perception components have overlapping parts, so that the stereo perception components can be ensured to cover the whole range of the highway, and the problem that all areas cannot be monitored in real time in the manual inspection process can be solved.

(2) The stereo perception component can comprise a plurality of perception modes, and can verify the video monitoring result in an ultrasonic monitoring mode and an infrared monitoring mode besides two monitoring modes of video and audio.

(3) The stereo perception system suitable for audio and video acquisition, identification and monitoring on the highway is provided with a green energy source power supply such as solar energy and wind energy, so that the waste of energy can be avoided.

(4) The stereo perception system suitable for audio and video acquisition, identification and monitoring on the highway is provided with an anti-theft detection mode. Whether the stereo perception component is at risk of being stolen is detected by means such as an electronic fence or vibration detection, and an alarm signal is sent to the processor when a risk is detected. Accordingly, the processor can take various anti-theft measures, for example sending a signal to nearby traffic police so that they can arrive in time while an audible warning is issued to deter the thief, thereby preventing the stereo perception component from being stolen.

(5) The stereo perception system suitable for audio and video acquisition, identification and monitoring on the highway is also provided with the early warning module, and the early warning module can estimate the traffic change condition according to a series of acquired information, so that upcoming vehicles can be reminded in advance, and the traffic safety is further improved.

Drawings

In order that the present disclosure may be more readily and clearly understood, reference is now made to the following detailed description of the embodiments of the present disclosure taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a schematic block diagram of a stereo perception system suitable for audio/video acquisition, identification and monitoring on a highway according to an embodiment of the present invention;

fig. 2 is a schematic block diagram of a stereo perception system suitable for audio-video acquisition, recognition and monitoring on a road according to another embodiment of the present invention;

fig. 3 is a schematic diagram of a hardware connection structure of a stereo perception system for audio and video acquisition, identification and monitoring according to an embodiment of the present invention.

Detailed Description

The embodiments of the present invention will be further described with reference to the accompanying drawings. In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only used for convenience of description of the present invention, but do not indicate or imply that the device or component being referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Wherein the terms "first position" and "second position" are two different positions.

In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; the two components can be directly connected or indirectly connected through an intermediate medium, and the two components can be communicated with each other. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.

The technical solutions provided in the following embodiments of the present invention may be combined with each other unless contradicted by each other, and the technical features thereof may be replaced with each other.

Example 1

The embodiment provides a stereo perception system suitable for audio and video acquisition, identification and monitoring on a road, as shown in fig. 1, the stereo perception system comprises a plurality of stereo perception assemblies 10 and a processor module 20, different stereo perception assemblies 10 are arranged at different positions of the road, and monitoring areas of two adjacent stereo perception assemblies 10 are provided with overlapping parts. In particular, reference is made to fig. 2, in which:

a first machine learning model 201 and a second machine learning model 202 are preset in the processor module 20, the first machine learning model 201 is used for receiving a video signal and outputting an analysis result aiming at the video signal, and the second machine learning model 202 is used for receiving an audio signal and outputting an analysis result aiming at the audio signal;

the stereo perception component 10 comprises a video detector 101 and an audio detector 102; the video detector 101 is configured to collect video detection signals in a monitoring area and output the video detection signals to the processor module 20; the audio detector 102 is configured to collect an audio detection signal in a monitoring area and output the audio detection signal to the processor module 20;

the processor module 20 receives the video detection signal and the audio detection signal, parses the video detection signal through the first machine learning model 201 and outputs a first analysis result for the video detection signal, and parses the audio detection signal through the second machine learning model 202 and outputs a second analysis result for the audio detection signal; it then acquires a stereoscopic perception result for the corresponding detection area according to the detection information obtained by fusing the first analysis result and the second analysis result.

In the above solution, the video detector 101 may be a high-definition camera, and the audio detector 102 may be a high-precision pickup. The first machine learning model 201 and the second machine learning model 202 may be mathematical models with an autonomous learning function that are available in the prior art, such as neural network models. When trained on a large number of samples, such a model autonomously adjusts its weight parameters until they reach appropriate values. When actual data of the same type as the sample data is then input, the model analyzes the data and outputs the desired analysis result. In this scheme, the first machine learning model 201 is used for identifying the video detection result, the second machine learning model 202 is used for identifying the audio detection result, and both models are trained on a large number of video signals and audio signals before being used. Therefore, it is preferable that a training sample storage unit is further disposed in the processor module 20; the training sample storage unit is used for storing training sample data, and the training sample data comprises sample video data and sample audio data; the sample video data is used to train the first machine learning model 201; the sample audio data is used to train the second machine learning model 202. A test sample storage unit is also arranged in the processor module 20; the test sample storage unit is used for storing test sample data, and the test sample data comprises video test data and audio test data; the video test data is used for testing the trained first machine learning model 201; the audio test data is used to test the trained second machine learning model 202.

That is, the trained machine learning model can be tested with test samples. Each test sample comprises input data and ideal output data. During testing, the input data is fed directly into the machine learning model to obtain actual output data, and the actual output data is compared with the ideal output data to judge whether the trained machine learning model is suitable.
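The test procedure described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `evaluate_model`, the `min_accuracy` acceptance threshold and the toy audio model are all hypothetical, and exact-match accuracy is only one assumed way of comparing actual output with ideal output.

```python
from typing import Callable, Sequence, Tuple


def evaluate_model(
    model: Callable[[str], str],
    test_samples: Sequence[Tuple[str, str]],
    min_accuracy: float = 0.9,  # hypothetical acceptance threshold
) -> Tuple[float, bool]:
    """Feed each test input to the model and compare with the ideal output.

    Returns (accuracy, is_suitable).
    """
    if not test_samples:
        raise ValueError("test_samples must not be empty")
    correct = sum(1 for inp, ideal in test_samples if model(inp) == ideal)
    accuracy = correct / len(test_samples)
    return accuracy, accuracy >= min_accuracy


def toy_audio_model(clip_name: str) -> str:
    """Stand-in for the second machine learning model (assumption)."""
    return "horn" if "horn" in clip_name else "background"


# Each test sample pairs input data with its ideal output data.
samples = [
    ("horn_01", "horn"),
    ("wind_01", "background"),
    ("horn_02", "horn"),
    ("siren_01", "siren"),  # the toy model gets this one wrong
]
accuracy, suitable = evaluate_model(toy_audio_model, samples, min_accuracy=0.7)
```

With a stricter threshold the same model would be judged unsuitable and would need further training before use.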

In this scheme, the data detected by the video detector 101 and the audio detector 102 are also stored as new sample data in the training sample storage unit in the processor module 20. The process of acquiring the stereoscopic perception result for the corresponding detection area from the fused first and second analysis results may consist of directly associating the first analysis result with the second analysis result; that is, the video detection result and the audio detection result are stored in association with the area number or geographic position of the monitoring area. The video detection result may include further information, such as weather, road markings, guardrails and plants. The audio detection result may include the horn sound of a vehicle, the siren of a special vehicle, and the like.
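As a rough illustration of fusion by association, the Python sketch below stores a video result and an audio result together under the area number of a monitoring area. The record layout, the names `PerceptionRecord`, `fuse_results` and `save_record`, and the example field values are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PerceptionRecord:
    """One fused result for a monitoring area (hypothetical layout)."""
    area_id: str
    video_result: Dict[str, str]
    audio_result: List[str]


def fuse_results(
    area_id: str, video_result: Dict[str, str], audio_result: List[str]
) -> PerceptionRecord:
    """Associate the first and second analysis results with the area number."""
    return PerceptionRecord(area_id, dict(video_result), list(audio_result))


# In-memory store keyed by area number; a real system would persist this.
store: Dict[str, List[PerceptionRecord]] = {}


def save_record(record: PerceptionRecord) -> None:
    store.setdefault(record.area_id, []).append(record)


record = fuse_results(
    "area-017",  # hypothetical area number
    {"weather": "fog", "guardrail": "intact"},
    ["horn"],
)
save_record(record)
```

Keying the store by area number keeps results from adjacent, overlapping areas separate while still allowing them to be looked up together later.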

According to this scheme, the stereo perception system is arranged on the highway to perceive the environment and traffic conditions on the highway stereoscopically. Video monitoring plays a role like the eyes of an inspector, and audio monitoring plays a role like the ears of an inspector. Because the monitoring areas of different stereo perception components overlap, the components can cover the whole range of the highway, which compensates for the inability of manual inspection to monitor all areas in real time.
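The coverage argument can be checked with a short Python sketch. It assumes, purely for illustration, that each monitoring area can be modelled as an interval of positions (in kilometres) along the road; the function name `covers_road` and this one-dimensional model are not from the patent.

```python
from typing import List, Tuple


def covers_road(
    areas: List[Tuple[float, float]], road_start: float, road_end: float
) -> bool:
    """Return True if adjacent areas overlap and together span the road.

    Each area is a (start, end) interval along the road (assumed model).
    """
    ordered = sorted(areas)
    if not ordered or ordered[0][0] > road_start:
        return False
    reach = ordered[0][1]
    for start, end in ordered[1:]:
        if start >= reach:
            # The next area only touches the covered stretch or leaves a gap,
            # so there is no overlapping part.
            return False
        reach = max(reach, end)
    return reach >= road_end


overlapping = [(0.0, 1.2), (1.0, 2.2), (2.0, 3.0)]
with_gap = [(0.0, 1.0), (1.1, 2.0)]
```

Here `covers_road(overlapping, 0.0, 3.0)` holds, while `covers_road(with_gap, 0.0, 2.0)` does not, because the stretch between 1.0 and 1.1 is unmonitored.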

Example 2

As shown in fig. 2, in the stereo perception system suitable for audio and video acquisition, identification and monitoring on a road provided in this embodiment, the stereo perception component 10 further includes an ultrasonic detector 103, where the ultrasonic detector 103 is configured to detect a state of an obstacle in a monitored area and output an ultrasonic detection result to the processor module 20; an ultrasonic signal analysis model 203 is preset in the processor module 20, and the ultrasonic signal analysis model 203 analyzes the ultrasonic detection result to determine a first obstacle state in the monitored area;

the stereo perception component 10 further includes an infrared detector 104, and the infrared detector 104 is configured to detect a state of an obstacle in a monitored area and output an infrared detection result to the processor module 20; an infrared signal analysis model 204 is preset in the processor module 20, and the infrared signal analysis model 204 analyzes the infrared detection result to determine the state of the obstacle in the monitored area;

the processor module 20 further comprises a verification module 205, wherein the verification module 205 receives a first analysis result output by the first machine learning model 201 and a first obstacle state output by the ultrasonic signal analysis model 203, verifies the first analysis result by using the first obstacle state, and determines that the first analysis result is valid if the consistency between the first obstacle state and the first analysis result exceeds a set value.

The verification module 205 is further configured to receive a first analysis result output by the first machine learning model 201 and an obstacle state output by the infrared signal analysis model 204, verify the first analysis result by using the obstacle state, and determine that the first analysis result is valid if the consistency between the obstacle state and the first analysis result exceeds a set value.

From the video detection result, the positions of obstacles in the monitoring area can be extracted, such as vehicles, street lamp poles and isolation piles; ultrasonic and infrared detection can likewise detect the positions of obstacles in the monitoring area. Video detection is less accurate when visibility is low, whereas ultrasonic and infrared detection still give relatively accurate results under those conditions, so the accuracy of the video detection result can be verified against the ultrasonic and infrared detection results when visibility is low. Preferably, when the visibility is lower than a certain value, the ultrasonic or infrared detection result can be used directly in place of the video detection result. Because ultrasonic and infrared detection results are easily affected by thunderstorm weather, in thunderstorm weather the fusion of the three detection results can be used as the final detection result, and staff can assist with identification.
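The patent does not define how consistency is measured. As one possible reading, the Python sketch below represents each obstacle state as a set of occupied grid cells in the monitoring area and uses the ratio of shared cells to all reported cells as the consistency score. The grid representation, the function names, the default set value of 0.8 and the 200 m visibility limit are all assumptions.

```python
from typing import Iterable, Set


def consistency(video_cells: Iterable[int], sensor_cells: Iterable[int]) -> float:
    """Share of occupied grid cells on which both sources agree (assumed metric)."""
    a, b = set(video_cells), set(sensor_cells)
    if not a and not b:
        return 1.0  # both report an empty area, so they fully agree
    return len(a & b) / len(a | b)


def is_valid(
    video_cells: Iterable[int], sensor_cells: Iterable[int], set_value: float = 0.8
) -> bool:
    """The first analysis result is valid if consistency exceeds the set value."""
    return consistency(video_cells, sensor_cells) > set_value


def select_result(
    video_cells: Iterable[int],
    sensor_cells: Iterable[int],
    visibility_m: float,
    min_visibility_m: float = 200.0,  # hypothetical visibility limit
) -> Set[int]:
    """Below the visibility limit, use the ultrasonic or infrared result directly."""
    if visibility_m < min_visibility_m:
        return set(sensor_cells)
    return set(video_cells)


video = {1, 2, 3, 4, 5}
ultrasonic = {1, 2, 3, 4, 5, 6}
```

For these sample sets the score is 5/6, which exceeds 0.8, so the video result would be accepted; in fog with 50 m visibility `select_result` would return the ultrasonic set instead.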

Further preferably, the above scheme further comprises a solar cell module, wherein an electric energy output end of the solar cell module is electrically connected with power supply ends of the stereo perception component 10 and the processor module 20; and/or a wind energy battery assembly, wherein an electric energy output end of the wind energy battery assembly is electrically connected with power supply ends of the stereo perception component 10 and the processor module 20. Because this scheme is powered by green energy, waste of energy can be avoided.

In addition, in the above solution, the stereo perception system may further include an electronic fence, where the electronic fence includes a coil sensor buried at a set distance around the stereo perception component; the coil sensor detects the pressure value applied to it, and if the pressure value exceeds a set pressure threshold, the coil sensor sends an alarm signal to the processor module 20. As another possible implementation:

the above scheme may further include a vibration sensor arranged on the packaging shell of the stereo perception component; the vibration sensor detects the vibration frequency or the vibration amplitude of the stereo perception component, and if the vibration frequency exceeds a frequency threshold or the vibration amplitude exceeds an amplitude threshold, the vibration sensor sends an alarm signal to the processor module 20. Whether the stereo perception component is at risk of being stolen is thus detected by means such as the electronic fence or vibration detection, and an alarm signal is sent to the processor when a risk is detected. Accordingly, the processor can take various anti-theft measures, for example sending a signal to nearby traffic police so that they can arrive in time while an audible warning is issued to deter the thief, thereby preventing the stereo perception component from being stolen.
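The threshold logic of the two anti-theft detectors can be sketched in Python as follows. The class and function names, the units and the threshold values are hypothetical; the patent only states that an alarm is sent when a reading exceeds its set threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TamperThresholds:
    """Set thresholds for the anti-theft sensors (values are assumptions)."""
    pressure: float             # coil sensor pressure threshold
    vibration_frequency: float  # vibration frequency threshold
    vibration_amplitude: float  # vibration amplitude threshold


def tamper_alarms(
    pressure: float, frequency: float, amplitude: float, limits: TamperThresholds
) -> List[str]:
    """Return the alarm signals that would be sent to the processor module."""
    alarms = []
    if pressure > limits.pressure:
        alarms.append("electronic_fence")
    # Either a frequency or an amplitude excursion triggers the vibration alarm.
    if frequency > limits.vibration_frequency or amplitude > limits.vibration_amplitude:
        alarms.append("vibration")
    return alarms


limits = TamperThresholds(pressure=500.0, vibration_frequency=20.0, vibration_amplitude=2.0)
quiet = tamper_alarms(10.0, 1.0, 0.1, limits)
intrusion = tamper_alarms(650.0, 5.0, 3.5, limits)
```

An empty list means no alarm is raised; the processor could map each returned name to a measure such as notifying nearby traffic police or playing a warning sound.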

Further, the above scheme further includes an early warning module and a prompt module, where the early warning module receives a first analysis result output by the first machine learning model 201, a second analysis result output by the second machine learning model 202, an ultrasonic detection result output by the ultrasonic detector 103, and an infrared detection result output by the infrared detector 104, and predicts a road traffic condition according to the first analysis result, the second analysis result, the ultrasonic detection result, and the infrared detection result; the early warning module then controls the prompt module to send out a prompt signal at a set time point according to the predicted road traffic condition. The prompt module comprises a high-pitch loudspeaker and/or an optical spotlight; the high-pitch loudspeaker is used for playing specific voice prompts, and the optical spotlight is used for focusing light on the area where the vehicle is located.

For example, the position range of vehicles in the current monitoring area can be predicted from the outputs of the machine learning models and the ultrasonic and infrared detection results. If a traffic accident has occurred in a monitoring area, a prompt can be given in time when a vehicle is about to reach that monitoring area, so that the vehicle can prepare in advance, for example by changing lanes.
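The patent does not say how the set time point is chosen. One simple assumption is to estimate when a vehicle will reach the affected monitoring area from its distance and speed, and to issue the prompt a fixed lead time before arrival. The Python sketch below follows that assumption; the function name and the 10-second default lead time are hypothetical.

```python
def seconds_until_prompt(
    distance_m: float, speed_mps: float, lead_time_s: float = 10.0
) -> float:
    """Delay before prompting a vehicle approaching an affected area.

    Assumes constant speed; returns 0.0 if the vehicle is already so close
    that the prompt should be issued immediately.
    """
    if speed_mps <= 0:
        raise ValueError("speed_mps must be positive")
    arrival_s = distance_m / speed_mps
    return max(0.0, arrival_s - lead_time_s)


far_vehicle = seconds_until_prompt(600.0, 30.0)   # arrives in 20 s
near_vehicle = seconds_until_prompt(150.0, 30.0)  # arrives in 5 s
```

The far vehicle would be prompted after 10 seconds and the near vehicle immediately, which gives both drivers as much of the lead time as is still available.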

Example 3

The stereo perception system suitable for audio and video acquisition, identification and monitoring on the highway provided by the embodiment comprises a processor, a memory and a communication component. Generally, the memory stores specific code, which is specifically executed by the processor, and the communication component is used for communicating with other terminal devices.

Fig. 3 is a schematic diagram of a hardware structure of an electronic device of a stereo perception system suitable for audio and video acquisition, identification and monitoring on a road according to this embodiment, and as shown in fig. 3, the device includes:

one or more processors 8 and a memory 9, one processor 8 being taken as an example in fig. 3. The device may further include a receiving component 10 and a sending component 11.

The processor 8, the memory 9, the receiving component 10 and the sending component 11 may be connected by a bus or by other means; connection by a bus is taken as an example in fig. 3.

The memory 9, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the corresponding program instructions/modules in embodiment 1 or 2 in the embodiments of the present application. The processor 8 executes various functional applications of the server and data processing by running nonvolatile software programs, instructions, and modules stored in the memory 9, that is, implements the solution of the above-described embodiment.

The memory 9 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by use of the apparatus according to embodiment 2, and the like. Further, the memory 9 may include a high speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 9 may optionally include memory located remotely from the processor 8, which may be connected to the apparatus of embodiments 1 or 2 via a network.

In addition, the logic instructions in the memory may be stored in a computer readable storage medium when the logic instructions are implemented in the form of software functional units and sold or used as independent products. Based on such understanding, the technical solution of the present invention or a part of the technical solution that contributes to the prior art in essence can be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a mobile terminal (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

The above-described embodiments of the apparatus are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention. Those skilled in the art can understand and implement this without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that the embodiments may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the embodiments of the present invention, and not to limit the same; although embodiments of the present invention have been described in detail with reference to the foregoing embodiments, it should be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A stereoscopic sensing system suitable for audio and video acquisition, identification and monitoring on a highway is characterized by comprising a plurality of groups of stereoscopic sensing assemblies and a processor module, wherein different stereoscopic sensing assemblies are arranged at different positions of the highway, and monitoring areas of two adjacent groups of stereoscopic sensing assemblies are provided with overlapping parts; wherein:

2. The stereo perception system adapted for audio-visual capture, identification and monitoring on highways of claim 1, wherein:

an ultrasonic signal analysis model is preset in the processor module and analyzes the ultrasonic detection result to determine a first barrier state in the monitored area;

3. The stereo perception system adapted for audio-visual capture, identification and monitoring on highways of claim 2, wherein:

4. The stereo perception system for audio-visual capture, identification and monitoring on highways of claim 1, wherein:

5. The stereo perception system for audio-visual capture, identification and monitoring on highways of claim 4, wherein:

the processor module is also provided with a test sample storage unit; the test sample storage unit is used for storing test sample data, and the test sample data comprises video test data and audio test data; the video test data is used for testing the trained first machine learning model; and the audio test data is used for testing the trained second machine learning model.

6. The stereo perception system for audio-visual capture, identification and monitoring on highways according to any of claims 1-5, further comprising:

and the electric energy output end of the wind energy battery assembly is electrically connected with the stereoscopic sensing assembly and the power supply end of the processor module.

7. The stereo perception system for audio-visual capture, identification and monitoring on a highway of claim 6 further comprising:

8. The stereo perception system for audio-visual capture, identification and monitoring on a highway according to claim 7 further comprising:

the vibration sensor is arranged on the packaging shell of the three-dimensional sensing assembly; the vibration sensor detects the vibration frequency or the vibration amplitude of the stereoscopic perception assembly, and if the vibration frequency exceeds a frequency threshold or the vibration amplitude exceeds an amplitude threshold, the vibration sensor sends an alarm signal to the processor module.

9. The stereo perception system for audio and video acquisition, identification and monitoring on highways of claim 8 further comprising an early warning module and a prompt module:

and controlling the prompting module to send out a prompting signal at a set time point according to the predicted road traffic condition.

10. The stereo perception system for audio-visual capture, identification and monitoring on highways of claim 9, wherein: