WO2018201349A1 - Method and device for identifying an emergency vehicle - Google Patents

Method and device for identifying an emergency vehicle

Info

Publication number
WO2018201349A1
Authority
WO
WIPO (PCT)
Prior art keywords
emergency vehicle
sound
sound signal
auditory
type
Prior art date
Application number
PCT/CN2017/082915
Other languages
English (en)
French (fr)
Inventor
宋风龙
刘浏
汪涛
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201780082732.1A priority Critical patent/CN110168625A/zh
Priority to PCT/CN2017/082915 priority patent/WO2018201349A1/zh
Publication of WO2018201349A1 publication Critical patent/WO2018201349A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B3/00 Audible signalling systems; Audible personal calling systems
    • G08B3/10 Audible signalling systems; Audible personal calling systems using electric transmission; using electromagnetic transmission
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/04 Detecting movement of traffic to be counted or controlled using optical or ultrasonic detectors
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/09 Arrangements for giving variable traffic instructions
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/09 Arrangements for giving variable traffic instructions
    • G08G1/0962 Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages
    • G08G1/0965 Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages responding to signals from another vehicle, e.g. emergency vehicle

Definitions

  • The invention relates to the field of automatic driving, and in particular to a method and a device for identifying an emergency vehicle.
  • Autonomous driving is one of the application scenarios of current cognitive computing technology.
  • Mainstream Internet companies with machine learning and cognitive computing technologies are developing learning and control technology for autonomous driving.
  • Mainstream automobile companies are using their automotive technology to develop autonomous-driving hardware.
  • Identification is an important part of autonomous vehicle technology. It usually works by collecting data about the vehicle's surroundings through cameras, radars, and other sensors, and understanding and judging the surrounding environment with rule-based or learning-based methods, thereby predicting the actions of surrounding objects.
  • The main method of visual data processing determines the type and active state of an emergency vehicle according to whether its warning lights are on.
  • However, an emergency vehicle with flashing lights is not necessarily active, that is, it is not necessarily performing a task, so in this case the identification of the emergency vehicle will be wrong.
  • Embodiments of the present invention provide a method and apparatus for identifying an emergency vehicle to achieve accurate identification of an emergency vehicle.
  • an embodiment of the present invention provides a method for identifying an emergency vehicle, the method comprising the steps of:
  • the type and status of the emergency vehicle is determined based on the auditory category encoding of the sound signal and/or the visual category encoding of the video image.
  • the sound signal is subjected to an analysis process to obtain an auditory category code of the sound signal, which specifically includes:
  • a sound sequence of the sound signal is detected and identified to obtain an auditory category code of the sound signal.
  • the sound sequence of the sound signal is detected and identified, and the auditory category code of the sound signal is obtained, which specifically includes:
  • the voice recognition model detects and identifies a sound sequence of the sound signal according to the type identification parameter, and an auditory category code of the sound signal is obtained.
  • the training the voice recognition model specifically includes:
  • the voice recognition model is trained by using a sound signal in a first classification information dictionary;
  • the first classification information dictionary includes at least correspondence information between the sound features of an emergency vehicle siren and the emergency vehicle type, and correspondence information between the sound of the emergency vehicle siren and the state of the emergency vehicle.
  • before the voice recognition model is trained by using the sound signal in the first classification information dictionary, the method further includes:
  • the sound features of the emergency vehicle siren include a sound frequency range and a pitch-change period.
  • the video image is subjected to an analysis process to obtain a visual category code of the video image, which specifically includes:
  • a visual category code of the video image is generated based on a color and a blinking pattern of the warning light.
  • before the video image is analyzed to obtain the visual category code of the video image, the method further includes:
  • the lighting features of the warning light include the color of the warning light and the blinking mode of the warning light.
  • the determining the type and status of the emergency vehicle according to the auditory category encoding of the sound signal and/or the visual category encoding of the video image includes:
  • when the access index includes only the auditory category code, searching the first classification information dictionary, according to the access index, for the type and state of the emergency vehicle corresponding to the auditory category code;
  • when the access index includes only the visual category code, searching the second classification information dictionary, according to the access index, for the type and state of the emergency vehicle corresponding to the visual category code;
  • when the access index includes both the auditory category code and the visual category code, searching the first classification information dictionary and the second classification information dictionary, according to the access index, for the type and state of the emergency vehicle that the auditory category code and the visual category code jointly correspond to.
  • an embodiment of the present invention provides an emergency vehicle identification device, which is used to perform the emergency vehicle identification method provided by the embodiment of the present invention, and the device includes:
  • a sound sensor for obtaining an acoustic signal of an environment in which the emergency vehicle is located
  • An optical sensor for obtaining a video image of an environment in which the emergency vehicle is located
  • a processor configured to analyze the sound signal to obtain an auditory category code of the sound signal, analyze the video image to obtain a visual category code of the video image, and determine a type and a state of the emergency vehicle according to the auditory category code and/or the visual category code;
  • a memory for storing type and status information of the emergency vehicle.
  • the processor is specifically configured to:
  • a sound sequence of the sound signal is detected and identified to obtain an auditory category code of the sound signal.
  • the processor is further configured to:
  • the voice recognition model detects and recognizes a sound sequence of the sound signal according to the type identification parameter, and obtains an auditory category code of the sound signal.
  • the processor reads the sound signal in the first classification information dictionary to train the voice recognition model; the first classification information dictionary includes at least correspondence information between the sound features of the emergency vehicle siren and the type of the emergency vehicle, and correspondence information between the sound of the emergency vehicle siren and the state of the emergency vehicle.
  • before training the voice recognition model, the processor constructs the first classification information dictionary according to the correspondence between the sound features of the emergency vehicle siren and the type of the emergency vehicle and the correspondence between the sound of the emergency vehicle siren and the state of the emergency vehicle, and writes the first classification information dictionary into the memory; the sound features of the emergency vehicle siren include a sound frequency range and a pitch-change period.
  • the processor is further configured to:
  • a visual category code of the video image is generated based on a color and a blinking pattern of the warning light.
  • before analyzing the video image, the processor constructs the second classification information dictionary according to the correspondence between the lighting features of the emergency vehicle warning light and the type of the emergency vehicle and the correspondence between the blinking mode of the warning light and the state of the emergency vehicle, and writes the second classification information dictionary into the memory; the lighting features of the warning light include the color of the warning light and the blinking mode of the warning light.
  • the processor is further configured to:
  • the processor is further configured to:
  • when the access index includes only the auditory category code, searching the first classification information dictionary, according to the access index, for the type and state of the emergency vehicle corresponding to the auditory category code;
  • when the access index includes only the visual category code, searching the second classification information dictionary, according to the access index, for the type and state of the emergency vehicle corresponding to the visual category code;
  • when the access index includes both the auditory category code and the visual category code, searching the first classification information dictionary and the second classification information dictionary for the type and state of the emergency vehicle that the auditory category code and the visual category code jointly correspond to.
  • the embodiment of the present invention further provides an emergency vehicle identification device, which is used to perform the emergency vehicle identification method provided by the embodiment of the present invention, and the device includes:
  • a data receiving unit configured to acquire sound signals and video images of an environment in which the emergency vehicle is located
  • a multi-modality sensing unit configured to analyze the sound signal to obtain an auditory category code of the sound signal, and analyze the video image to obtain a visual category code of the video image;
  • a matching unit configured to determine a type and a state of the emergency vehicle according to an auditory category encoding of the sound signal and/or a visual category encoding of the video image.
  • the multi-modality sensing unit includes: an auditory processing module; the auditory processing module is specifically configured to:
  • a sound sequence of the sound signal is detected and identified to obtain an auditory category code of the sound signal.
  • the device further includes:
  • the auditory processing module is further configured to:
  • the voice recognition model detects and recognizes a sound sequence of the sound signal according to the type identification parameter, and obtains an auditory category code of the sound signal.
  • the auditory processing module reads the sound signal in the first classification information dictionary to train the voice recognition model; the first classification information dictionary includes at least correspondence information between the sound features of the emergency vehicle siren and the type of the emergency vehicle, and correspondence information between the sound of the emergency vehicle siren and the state of the emergency vehicle.
  • before training the voice recognition model, the auditory processing module constructs the first classification information dictionary according to the correspondence between the sound features of the emergency vehicle siren and the type of the emergency vehicle and the correspondence between the sound of the emergency vehicle siren and the state of the emergency vehicle, and stores the first classification information dictionary; the sound features of the emergency vehicle siren include a sound frequency range and a pitch-change period.
  • the multi-modality sensing unit further includes a visual processing module; the visual processing module is configured to:
  • a visual category code of the video image is generated based on a color and a blinking pattern of the warning light.
  • before analyzing the video image, the visual processing module constructs the second classification information dictionary according to the correspondence between the lighting features of the emergency vehicle warning light and the type of the emergency vehicle and the correspondence between the blinking mode of the warning light and the state of the emergency vehicle, and stores the second classification information dictionary; the lighting features of the warning light include the color of the warning light and the blinking mode of the warning light.
  • the multi-modality sensing unit further includes an access index generating module
  • the access index generating module is configured to generate an access index according to the auditory category encoding and/or the visual category encoding;
  • the matching unit matches the type and status of the emergency vehicle from the first classification information dictionary and/or the second classification information dictionary according to the access index.
  • the matching unit searches the first classification information dictionary, according to the access index, for the type and state of the emergency vehicle corresponding to the auditory category code;
  • the matching unit searches the second classification information dictionary, according to the access index, for the type and state of the emergency vehicle corresponding to the visual category code;
  • the matching unit searches the first classification information dictionary and the second classification information dictionary, according to the access index, for the type and state of the emergency vehicle that the auditory category code and the visual category code jointly correspond to.
  • an embodiment of the present invention further provides a vehicle including a plurality of emergency vehicle identification devices provided by an embodiment of the present invention, wherein the emergency vehicle identification device is configured to identify an emergency vehicle type and state.
  • An embodiment of the present invention provides an emergency vehicle identification method that acquires multi-modal information, that is, the sound signal and the video image of the environment in which an emergency vehicle is located, and analyzes the sound signal and the video image separately to obtain the auditory category code of the sound signal and the visual category code of the video image.
  • Using the auditory category code, the visual category code, or the combination of the two, the method looks up the standard correspondence information between the emergency vehicle types and states specified in existing standards and the siren sound frequency, pitch-change period, warning light color, and warning light blinking mode, and matches the type and active state of the emergency vehicle to be identified, thereby achieving accurate identification of the emergency vehicle in the following scenarios:
  • the detection and identification of the emergency vehicles can be completed independently by only the auditory category coding
  • the detection and identification of emergency vehicles can be completed independently by visual category coding
  • the emergency vehicle can be detected and identified in a way that improves the accuracy of identification.
  • FIG. 1 is a schematic structural diagram of an identification device for an emergency vehicle according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for identifying an emergency vehicle according to an embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of an identification device for an emergency vehicle according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of framing a sound signal according to an embodiment of the present invention.
  • FIG. 5 is an observable sound sequence obtained by extracting an acoustic feature from the framed sound signal shown in FIG. 4 according to an embodiment of the present invention
  • FIG. 6 is a schematic diagram of a process of recognizing the sound sequence shown in FIG. 5 by the voice recognition model provided by the embodiment of the present invention
  • FIG. 7 is a schematic diagram of image semantic segmentation of a video frame of an emergency vehicle according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of video semantic analysis of an emergency vehicle warning light area according to an embodiment of the present invention.
  • FIG. 1 is a schematic structural diagram of an identification device for an emergency vehicle according to an embodiment of the present invention. As shown in FIG. 1, the device includes a sound sensor 101, an optical sensor 102, a processor 103, and a memory 104.
  • the sound sensor 101 is used to acquire a sound signal of the environment in which the emergency vehicle is located.
  • the optical sensor 102 is used to acquire a video image of the environment in which the emergency vehicle is located.
  • the optical sensor 102 can be a camera or a laser radar-based optical scanning device.
  • The processor 103 is configured to analyze the sound signal to obtain an auditory category code of the sound signal, analyze the video image to obtain a visual category code of the video image, and match the type and state of the emergency vehicle from the memory 104 according to the auditory category code and the visual category code.
  • the memory 104 is used to store the type of emergency vehicle and all possible status information.
  • FIG. 2 is a flowchart of a method for identifying an emergency vehicle according to an embodiment of the present invention.
  • The method for identifying an emergency vehicle according to an embodiment of the present invention is performed by an apparatus for identifying an emergency vehicle, which may be the apparatus shown in FIG. 1.
  • the following is a specific implementation process of the method embodiment:
  • Step S200 Acquire a sound signal and a video image of an environment in which the emergency vehicle is located.
  • The siren and the warning light of an emergency vehicle are generally turned on.
  • The sirens of different emergency vehicles, such as police cars, ambulances, engineering repair vehicles, and fire engines, sound differently, and their warning lights illuminate in the corresponding color and flashing mode.
  • Due to environmental factors, the sound of an emergency vehicle is mixed with various other sounds in the environment while it is driving. Therefore, in automatic driving technology, to identify emergency vehicles around the self-driving vehicle, it is necessary to acquire the sound signals and video images of the surrounding environment.
  • The subsequent steps analyze the acquired sound signal and video image to identify the emergency vehicle and its type and active state, so that the self-driving vehicle can adjust its driving planning and control.
  • Step S210 Perform analysis processing on the sound signal to obtain an auditory category code of the sound signal, and perform analysis processing on the video image to obtain a visual category code of the video image.
  • the analysis processing of the sound signal obtains the auditory category code of the sound signal, and specifically includes:
  • Step S2101 Perform framing processing on the sound signal.
  • Step S2102 Perform acoustic feature extraction on the framed processed sound signal to obtain a sound sequence of the sound signal.
  • Step S2103 Detect and recognize the sound sequence of the sound signal to obtain an auditory category code of the sound signal. This step specifically includes:
  • a first classification information dictionary is constructed, and the first classification information dictionary is stored.
  • The first classification information dictionary may be constructed according to the correspondence between the sound features of the emergency vehicle siren and the type of the emergency vehicle, and the correspondence between the sound of the emergency vehicle siren and the state of the emergency vehicle. That is, the first classification information dictionary includes at least these two kinds of correspondence information. The sound features of the emergency vehicle siren include the sound frequency range and the pitch-change period.
  • The first classification information dictionary is constructed from the standard correspondence information between the siren sound frequency range and pitch-change period and the emergency vehicle type.
  • The sound emitted by the siren of each type of emergency vehicle has a corresponding frequency range, and the sound within that frequency range has a corresponding pitch-change period. The state of the emergency vehicle is either active or inactive: a sounding siren means the vehicle is active, and a silent siren means it is not. Therefore, as long as the sound signal of a siren is detected, the emergency vehicle can be considered active.
  • the voice recognition model may be trained by using a sound signal defined in the first classification information dictionary to obtain a type identification parameter of the voice recognition model;
  • the sound recognition model detects and identifies the sound sequence of the sound signal according to the type identification parameter, and obtains the auditory category code of the sound signal.
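  • As a rough illustration of how the first classification information dictionary could be laid out, the sketch below keys each entry by an auditory category code and stores a frequency range and a pitch-change period range per vehicle type. The codes (`A01` and so on), the numeric ranges, and the function names are all hypothetical and chosen only for this example, and the range check is a simple rule-based stand-in for the trained voice recognition model, not the model itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SirenEntry:
    vehicle_type: str
    freq_range_hz: Tuple[float, float]  # (low, high) siren frequency range
    period_s: Tuple[float, float]       # (min, max) pitch-change period


# Hypothetical values for illustration only; a real dictionary would be
# filled from the applicable siren standard.
FIRST_DICTIONARY = {
    "A01": SirenEntry("police car", (650.0, 1500.0), (0.3, 0.6)),
    "A02": SirenEntry("ambulance", (400.0, 700.0), (1.0, 2.0)),
    "A03": SirenEntry("fire engine", (700.0, 1600.0), (3.0, 5.0)),
}


def auditory_code(freq_hz: float, period_s: float) -> Optional[str]:
    """Return the code whose frequency and period ranges contain the measurement."""
    for code, entry in FIRST_DICTIONARY.items():
        f_lo, f_hi = entry.freq_range_hz
        p_lo, p_hi = entry.period_s
        if f_lo <= freq_hz <= f_hi and p_lo <= period_s <= p_hi:
            return code
    return None


def lookup_by_sound(code: Optional[str]):
    """A detected siren implies the active state; no code means no siren was found."""
    if code is None:
        return None
    return FIRST_DICTIONARY[code].vehicle_type, "active"
```

  • Because the text treats any detected siren as evidence of the active state, the lookup only needs to resolve the vehicle type; the state follows from the detection itself.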
  • the video image is subjected to an analysis process to obtain a visual category code of the video image, which specifically includes:
  • Step S2104 Construct a second classification information dictionary, and store the second classification information dictionary.
  • The second classification information dictionary may be constructed according to the correspondence between the lighting features of the emergency vehicle warning light and the emergency vehicle type, and the correspondence between the blinking mode and the state, where the lighting features of the warning light include the color and the blinking mode of the warning light. That is, the second classification information dictionary includes at least the correspondence between the color and blinking mode of the emergency vehicle warning light and the emergency vehicle type, and the correspondence between the blinking mode and the state.
  • Each type of emergency vehicle warning light has a corresponding color change and blinking mode when lit. The state of the emergency vehicle can be divided into an active state and an inactive state, and the two states can be expressed by different blinking modes.
  • Step S2105 detecting the video image to determine a video frame of the emergency vehicle in the video image
  • Step S2106 performing image semantic segmentation on the video frame to obtain a warning light area of the emergency vehicle
  • Step S2107 performing video semantic analysis on the warning light area of the emergency vehicle to obtain a color and a blinking mode of the warning light;
  • Step S2108 Generate a visual category code of the video image according to the color and the blinking mode of the warning light.
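  • Step S2108 can be sketched as follows, assuming the video semantic analysis has already reduced the warning light area to a color label and a per-frame on/off sequence. The mode names, the 2 Hz threshold, and the `V:color:mode` code format are hypothetical choices for this example; the source does not specify how the code is encoded.

```python
def blink_mode(lit_frames, fps, fast_hz=2.0):
    """Classify the blinking mode from per-frame on/off states of the light."""
    if not any(lit_frames):
        return "off"
    if all(lit_frames):
        return "steady"
    # Count off-to-on transitions and convert them to flashes per second.
    rising = sum(1 for prev, cur in zip(lit_frames, lit_frames[1:]) if cur and not prev)
    duration_s = len(lit_frames) / fps
    return "fast" if rising / duration_s >= fast_hz else "slow"


def visual_code(color, lit_frames, fps):
    """Combine the color and the blinking mode into one visual category code."""
    return f"V:{color}:{blink_mode(lit_frames, fps)}"
```

  • For instance, a red light that toggles every frame of a 30 fps clip yields `V:red:fast`, which could then serve as a key into the second classification information dictionary.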
  • Step S220 determining the type and state of the emergency vehicle according to the auditory category encoding of the sound signal and/or the visual category encoding of the video image.
  • an access index is generated according to the auditory category encoding and/or the visual category encoding.
  • When the access index is composed only of the auditory category code, the type and state of the emergency vehicle corresponding to the auditory category code are searched for in the first classification information dictionary according to the access index;
  • When the access index is composed only of the visual category code, the type and state of the emergency vehicle corresponding to the visual category code are searched for in the second classification information dictionary according to the access index;
  • When the access index is composed of both the auditory category code and the visual category code, the type and state of the emergency vehicle that the two codes jointly correspond to are searched for in the first classification information dictionary and the second classification information dictionary according to the access index.
  • That is, the matching searches the first classification information dictionary by the auditory category code, searches the second classification information dictionary by the visual category code, or combines the two codes and searches both dictionaries for the type and state of the emergency vehicle that they jointly correspond to.
  • The auditory category code and the visual category code obtained by analyzing the sound signal and the video image of the environment in which the emergency vehicle is located can each be used on its own to identify the type and state of the emergency vehicle, or the two codes can be combined to identify the type and state of the emergency vehicle and further improve the accuracy of the identification.
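  • The three lookup cases of step S220 can be sketched with an access index modeled as a pair of optional codes. The dictionary contents and code strings are hypothetical, and the rule used for the combined case (accept a result only when both dictionaries agree) is one possible reading of "jointly correspond", not something the source spells out.

```python
# Hypothetical dictionary contents: category code -> (vehicle type, state).
FIRST = {"A02": ("ambulance", "active")}
SECOND = {
    "V:blue:fast": ("ambulance", "active"),
    "V:blue:slow": ("ambulance", "inactive"),
}


def make_access_index(auditory=None, visual=None):
    """Build the access index from whichever category codes are available."""
    return (auditory, visual)


def match(index):
    """Look up the vehicle type and state for an access index."""
    auditory, visual = index
    by_sound = FIRST.get(auditory) if auditory else None
    by_light = SECOND.get(visual) if visual else None
    if auditory and visual:
        # Both modalities present: keep the result only if the two agree.
        return by_sound if by_sound is not None and by_sound == by_light else None
    return by_sound if auditory else by_light
```

  • With these sample entries, `match(make_access_index("A02", "V:blue:fast"))` resolves to an active ambulance, while a siren code paired with the slow-blink code is rejected because the two modalities disagree.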
  • FIG. 3 is a schematic structural diagram of another emergency vehicle identification device according to an embodiment of the present invention.
  • the device is a device that can identify an emergency vehicle and can be used to perform the method as shown in FIG. 2 provided by this embodiment.
  • the apparatus includes: a data receiving unit 301, a multi-modal sensing unit 302, a matching unit 303, and a storage unit 304.
  • the data receiving unit 301 includes a sound signal receiving module 3011 for collecting sound signals of the periphery of the self-driving car, and a video image receiving module 3012 for collecting video images of the periphery of the self-driving car.
  • the multimodal sensing unit 302 includes an auditory processing module 3021, a visual processing module 3022, and an access index generating module 3023.
  • The auditory processing module 3021 performs framing on the collected sound signal, extracts acoustic features from the framed sound signal, and then recognizes the acoustic features of the sound signal with an established voice recognition model to obtain the auditory category code of the sound signal.
  • The visual processing module 3022 detects the captured video image, determines the video frames of the emergency vehicle in the video image, performs image semantic segmentation on the video frames to obtain a video stream of the emergency vehicle warning light area, and then performs video semantic analysis on the video stream of the warning light area to obtain the color and blinking pattern of the warning light, thereby obtaining a visual category code of the emergency vehicle.
  • the access index generating module 3023 combines the auditory category encoding and the visual category encoding to generate an access index.
  • the auditory processing module 3021 and the visual processing module 3022 may be two different processors (or hardware function devices), or may be two dedicated ones in the same processor.
  • the matching unit 303 accesses the storage unit 304 according to the access index, and matches the type and status of the emergency vehicle from the storage unit 304.
  • the storage unit 304 stores a first classification information dictionary and a second classification information dictionary including an emergency vehicle type and all possible states.
  • The first classification information dictionary defines correspondence information between the emergency vehicle type and state and the sound frequency and pitch-change period of the emergency vehicle siren.
  • The second classification information dictionary defines correspondence information between the emergency vehicle type and state and the warning light color and warning light blinking mode of the emergency vehicle.
  • the specific implementation process of the auditory processing module 3021 is:
  • The auditory processing module 3021 uses a window function to frame the received sound signal according to a desired frame length and frame shift, that is, it divides the received sound signal into multiple overlapping frames. For example, the signal is segmented with a frame length of 25 ms and a frame shift of 15 ms.
  • The window function used in the embodiment of the present invention refers to a truncation function for truncating the sound signal.
  • The frame shift refers to the overlap amount of two adjacent frames of data, that is, the overlap between the tail of the previous frame and the head of the subsequent frame. Because of the frame shift, each frame contains components of the previous frame and the next frame, which prevents discontinuity between two frames, so that each frame of data is correlated with its neighbors and better approximates the actual sound.
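  • The framing step can be sketched in plain Python as below. Two assumptions are made here: the window is a Hamming window (the text only says "window function"), and the 15 ms frame shift is treated in the conventional signal-processing sense as the distance between the starts of adjacent frames. If the shift is instead read as the overlap amount, as the description above puts it, the hop between frames would be the frame length minus the shift.

```python
import math


def hamming(n):
    """Hamming window of length n (one common choice of window function)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, sample_rate, frame_ms=25.0, shift_ms=15.0):
    """Split samples into overlapping, windowed frames."""
    frame_len = int(round(sample_rate * frame_ms / 1000))
    hop = int(round(sample_rate * shift_ms / 1000))  # start-to-start distance
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames
```

  • Under these assumptions a 16 kHz signal gives 400-sample frames that start every 240 samples, so adjacent frames share 160 samples (10 ms).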
  • As shown in FIG. 5, the auditory processing module 3021 performs acoustic feature extraction on the framed sound signal by extracting Mel Frequency Cepstral Coefficients (MFCC). Specifically, according to the physiological characteristics of the human ear, the auditory processing module 3021 extracts the MFCC features of M points from each frame waveform of the sound signal and transforms them into an M-dimensional vector. The M-dimensional vector of each frame contains the content information of that frame of the sound signal. The M-dimensional vectors of all frames are combined into a matrix of M rows and N columns, giving an observable sound sequence that characterizes the sound signal, where N is the total number of frames. For example, the embodiment of the present invention takes M equal to 12 and obtains the observable sound sequence of 12 rows by N columns shown in FIG. 5, in which each frame is represented by a 12-dimensional vector and the shade of each color block indicates the magnitude of the vector value.
  • The specific process of extracting the MFCC features for each frame of the sound signal is: apply a Fast Fourier Transform (FFT) to each short-time analysis window to obtain the corresponding spectrum; pass the spectrum through a Mel filter bank to obtain the Mel spectrum; perform cepstral analysis on the Mel spectrum, that is, take the logarithm and apply the inverse transform, implemented as a discrete cosine transform (DCT); and take the 2nd to 13th coefficients after the DCT as the MFCC coefficients. The resulting Mel frequency cepstral coefficients are the features of that frame of sound.
  • In the process of extracting the acoustic features of the sound waveform, besides extracting Mel Frequency Cepstral Coefficients (MFCC), the auditory processing module 3021 may also use deep learning to extract the acoustic features of the sound signal.
  • As shown in FIG. 6, the auditory processing module 3021 may construct a sound recognition model based on a hidden Markov model (HMM), use the sound recognition model to recognize the sound sequence and obtain its category information, and encode and output the category information to obtain the auditory category code of the sound sequence.
  • After constructing the sound recognition model, the auditory processing module 3021 may train it with the sound signals in the first classification information dictionary to obtain the type identification parameters of the sound recognition model.
  • The sound recognition model recognizes the sound sequence produced by MFCC feature extraction according to the type identification parameters and obtains the auditory category code of the sound sequence. The auditory category refers to the category of the sound signal, which can be distinguished by a sound frequency and pitch-change period that conform to standards or conventions.
  • The sound signals in the first classification information dictionary are a large number of labeled road-environment sound signals; that is, they may be standard correspondence data that define warning siren sounds by the sound characteristics (sound frequency and pitch-change period) specified in the Chinese national standard GB8108-1999. For example, as shown in Table 1, a sound signal whose frequency lies within the range given there (the range appears only as an image in the original publication) and whose pitch-change period is between 0.333 and 0.385 is the emergency frequency-modulated tone, and the corresponding vehicle type is a police car.
  • In practical applications, the auditory processing module 3021 provided by the embodiment of the present invention may also be used to detect and recognize other objects, as long as the one-to-one correspondence between the sound features emitted by an object and that object is stored as the first classification information dictionary.
  • The visual processing module 3022 may use a multi-target detection algorithm to detect whether there is an emergency vehicle in the video image and obtain the video frames of the emergency vehicle.
  • The multi-target detection algorithm may be a real-time multi-target detection algorithm such as SSD or YOLO, or the R-CNN, Fast R-CNN or Faster R-CNN algorithm.
  • As shown in FIG. 7, the visual processing module 3022 may use a convolutional neural network (CNN) combined with a deep convolutional neural network (DCNN) to perform image semantic segmentation on the video frame and cut out the warning light region. Specifically, the CNN first extracts the features of each pixel in the image, and the DCNN then restores the position of each pixel in the original image, achieving pixel-level recognition and outputting the segmentation of the warning light region.
  • As shown in FIG. 8, the visual processing module 3022 performs video semantic analysis on the video stream of the segmented warning light region, detects and recognizes the color and flashing pattern of the warning light, and thereby obtains the visual category code of the video image.
  • Specifically, a CNN may be combined with a Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM). For the region of interest (ROI) already segmented in each frame of the multi-frame video stream, namely the warning light region, the CNN extracts the features of the ROI in each frame, including the different color blocks and shapes of the warning light in the current frame. An RNN implemented with a two-layer LSTM then extracts the temporal relationship between frames from these features, outputs the color-change characteristics of the warning light, determines whether the warning light is flashing and what color its color blocks are, and outputs the visual category code.
  • The visual category refers to a category distinguished by a warning light color and color change (that is, a flashing pattern) that conform to standards or conventions.
  • The visual category code is the code of the visual category.
  • In the embodiment of the present invention, for streaming video the visual processing module 3022 uses target detection and ROI segmentation to restrict video semantic analysis to the limited area of the warning light, which reduces the amount of computation and increases recognition speed, making it suitable for self-driving cars with limited computing power.
  • In addition, when different emergency vehicles have different warning lights and flashing patterns, the processing and recognition of the video image by the visual processing module 3022 alone is enough to detect and identify an active emergency vehicle.
  • In practical applications, the visual processing module 3022 provided by the embodiment of the present invention may also be used to detect and recognize other objects, as long as the one-to-one correspondence between an object's visual features and that object is stored as the second classification information dictionary.
  • After the auditory category code and the visual category code are obtained, the access index generating module 3023 generates an access index from the auditory category code and/or the visual category code.
  • When the access index consists only of the auditory category code, the matching unit 303 looks up, in the first classification information dictionary, the type and state of the emergency vehicle corresponding to the auditory category code. When the access index consists only of the visual category code, the matching unit 303 looks up, in the second classification information dictionary, the type and state of the emergency vehicle corresponding to the visual category code. When the access index consists of both the auditory category code and the visual category code, the matching unit 303 looks up, in the first and second classification information dictionaries, the type and state of the emergency vehicle that corresponds to both codes together.
  • the indexing manner when the matching unit 303 accesses the storage unit 304 may be determined according to the storage manner of the emergency vehicle type and state by the storage unit 304, for example:
  • the storage unit 304 may employ an enumeration manner to pre-store the vehicle type and all possible states thereof.
  • As shown in Table 2, the vehicle type and state data of the emergency vehicles can be stored as a two-dimensional table. Table 2 has N rows and 2 columns: each row corresponds to one emergency vehicle and its state, one column gives the type of the emergency vehicle, and the other column gives its state.
  • Table 2: One storage list for emergency vehicles
  • When the storage unit 304 uses the storage layout shown in Table 2, the access index generating module 3023 concatenates the visual category code and the auditory category code into an index code and uses it to access the storage unit 304, thereby determining the type and state of the emergency vehicle. Specifically, assuming there are N classes of visual category information and M classes of auditory category information, the visual category code and the auditory category code are log2(N) bits and log2(M) bits long respectively, and the index used to access the storage unit has (log2(N) + log2(M)) bits in total.
  • Optionally, as shown in Table 3, the storage unit 304 may also store the vehicle type and state data in a two-dimensional table in which each entry is a two-dimensional vector whose two components are the type of the emergency vehicle and its state.
  • When the storage unit 304 uses the storage layout shown in Table 3, the access index generating module 3023 uses the visual category code and the auditory category code as the row index and the column index respectively to access the storage unit 304, thereby determining the type and state of the emergency vehicle. Specifically, assuming there are N classes of visual category information and M classes of auditory category information, the visual and auditory category codes are log2(N) bits and log2(M) bits long respectively.
  • In the embodiment of the present invention, the two-dimensional table storage layouts shown in Table 2 and Table 3 make it easy to extend the stored emergency vehicle types and states, simply by extending the contents of the storage unit.
  • The emergency vehicle identification apparatuses of FIG. 1 and FIG. 3 provided by the embodiment of the present invention can be applied to self-driving vehicles and other traffic equipment to identify nearby emergency vehicles in time, and can also be applied to other moving objects that are identified by sound and video, so that the emergency vehicles and those other objects can be avoided in time.
  • In the embodiment of the present invention, when different emergency vehicles may have the same warning siren sound, or the same warning light and the same flashing pattern, the emergency vehicle identification apparatus can combine the processing of multimodal data by the auditory processing module 3021 and the visual processing module 3022 to identify the type of the emergency vehicle and whether it is active or inactive efficiently and accurately, supporting the planning and control of automatic driving.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Electromagnetism (AREA)
  • Business, Economics & Management (AREA)
  • Emergency Management (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)
  • Alarm Systems (AREA)

Abstract

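A method and an apparatus for identifying an emergency vehicle. The method includes: acquiring a sound signal and a video image of the environment in which the emergency vehicle is located; analyzing the sound signal to obtain an auditory code, and analyzing the video image to obtain a visual code; and determining the type and state of the emergency vehicle according to the auditory code and the visual code. Using the auditory code, the visual code, or both together, the method looks up the type and active state of the emergency vehicle to be identified in the standard correspondence information, defined by existing standards, that relates emergency vehicle types and states to siren sound frequency, pitch-change period, warning light color and warning light flashing pattern. This enables accurate identification of emergency vehicles in a variety of scenarios.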

Description

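Method and apparatus for identifying an emergency vehicle
Technical Field
The present invention relates to the field of automatic driving, and in particular to a method and an apparatus for identifying an emergency vehicle.
Background
With the development of the Internet, automatic driving has become one of the application scenarios of cognitive computing technology. At present, the mainstream Internet companies whose strengths are machine learning and cognitive computing are developing learning and control technologies for automatic driving, while the mainstream car makers whose strength is automotive technology are developing hardware technologies for automatic driving.
Automatic driving technology is developing well overall. At present the failure rate of automatic driving, measured over total mileage driven, is less than 1%, yet fully automatic driving is still not achievable even in familiar scenarios. One of the main causes of failure is recognition error. Recognition is an important part of self-driving car technology: it usually means collecting data about the vehicle's surroundings through cameras, radar, sensors and the like, understanding and determining the surrounding environment with rule-based or learning-based methods, and then predicting the actions of surrounding objects.
At present, techniques for identifying emergency vehicles (for example police cars, ambulances, fire engines and engineering rescue vehicles that are carrying out a task) mainly rely on visual data processing, that is, they judge the type and active state of the emergency vehicle according to whether its warning light is on. In many cases, however, an emergency vehicle with its lights flashing is not necessarily active, that is, not necessarily carrying out a task, so in such cases the emergency vehicle will be identified incorrectly.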
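Summary of the Invention
Embodiments of the present invention provide a method and an apparatus for identifying an emergency vehicle, so as to identify emergency vehicles accurately.
In one aspect, an embodiment of the present invention provides a method for identifying an emergency vehicle, the method including the steps of:
acquiring a sound signal and a video image of the environment in which the emergency vehicle is located;
analyzing the sound signal to obtain an auditory category code of the sound signal, and analyzing the video image to obtain a visual category code of the video image;
determining the type and state of the emergency vehicle according to the auditory category code of the sound signal and/or the visual category code of the video image.
In one possible implementation, analyzing the sound signal to obtain the auditory category code of the sound signal specifically includes:
dividing the sound signal into frames;
extracting acoustic features from the framed sound signal to obtain a sound sequence of the sound signal;
detecting and recognizing the sound sequence of the sound signal to obtain the auditory category code of the sound signal.
In one possible implementation, detecting and recognizing the sound sequence of the sound signal to obtain the auditory category code of the sound signal specifically includes:
constructing a sound recognition model;
training the sound recognition model to obtain type identification parameters of the sound recognition model;
the sound recognition model detecting and recognizing the sound sequence of the sound signal according to the type identification parameters to obtain the auditory category code of the sound signal.
In one possible implementation, training the sound recognition model specifically includes:
training the sound recognition model with the sound signals in a first classification information dictionary, where the first classification information dictionary contains at least correspondence information between the sound features of emergency vehicle sirens and emergency vehicle types, and correspondence information between the sounding of emergency vehicle sirens and emergency vehicle states.
In one possible implementation, before the sound recognition model is trained with the sound signals in the first classification information dictionary, the method further includes:
constructing the first classification information dictionary according to the correspondence between the sound features of emergency vehicle sirens and emergency vehicle types and the correspondence between the sounding of emergency vehicle sirens and emergency vehicle states, and storing the first classification information dictionary, where the sound features of an emergency vehicle siren include a sound frequency range and a pitch-change period.
In one possible implementation, analyzing the video image to obtain the visual category code of the video image specifically includes:
detecting the video image to obtain video frames of the emergency vehicle;
performing image semantic segmentation on the video frames to obtain the warning light region of the emergency vehicle;
performing video semantic analysis on the warning light region of the emergency vehicle to obtain the color and flashing pattern of the warning light;
generating the visual category code of the video image according to the color and flashing pattern of the warning light.
In one possible implementation, before the video image is analyzed to obtain the visual category code of the video image, the method further includes:
constructing a second classification information dictionary according to the correspondence between the lighting features of emergency vehicle warning lights and emergency vehicle types and the correspondence between warning light flashing patterns and emergency vehicle states, and storing the second classification information dictionary, where the lighting features of a warning light include its color and its flashing pattern.
In one possible implementation, determining the type and state of the emergency vehicle according to the auditory category code of the sound signal and/or the visual category code of the video image specifically includes:
generating an access index according to the auditory category code and/or the visual category code;
matching the type and state of the emergency vehicle from the first classification information dictionary and/or the second classification information dictionary according to the access index.
In one possible implementation,
when the access index contains the auditory category code, the type and state of the emergency vehicle corresponding to the auditory category code are looked up in the first classification information dictionary according to the access index;
when the access index contains the visual category code, the type and state of the emergency vehicle corresponding to the visual category code are looked up in the second classification information dictionary according to the access index;
when the access index contains both the auditory category code and the visual category code, the type and state of the emergency vehicle corresponding to both codes together are looked up in the first classification information dictionary and the second classification information dictionary according to the access index.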
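In another aspect, an embodiment of the present invention provides an apparatus for identifying an emergency vehicle, configured to perform the emergency vehicle identification method provided by the embodiment of the present invention, the apparatus including:
a sound sensor, configured to acquire a sound signal of the environment in which the emergency vehicle is located;
an optical sensor, configured to acquire a video image of the environment in which the emergency vehicle is located;
a processor, configured to analyze the sound signal to obtain an auditory category code of the sound signal, analyze the video image to obtain a visual category code of the video image, and determine the type and state of the emergency vehicle according to the auditory category code and/or the visual category code;
a memory, configured to store emergency vehicle type and state information.
In one possible implementation, the processor is specifically configured to:
divide the sound signal into frames;
extract acoustic features from the framed sound signal to obtain a sound sequence of the sound signal;
detect and recognize the sound sequence of the sound signal to obtain the auditory category code of the sound signal.
In one possible implementation, the processor is further specifically configured to:
construct a sound recognition model;
train the sound recognition model to obtain type identification parameters of the sound recognition model;
where the sound recognition model detects and recognizes the sound sequence of the sound signal according to the type identification parameters to obtain the auditory category code of the sound signal.
In one possible implementation, the processor reads the sound signals in a first classification information dictionary to train the sound recognition model, where the first classification information dictionary contains at least correspondence information between the sound features of emergency vehicle sirens and emergency vehicle types, and correspondence information between the sounding of emergency vehicle sirens and emergency vehicle states.
In one possible implementation, before training the sound recognition model, the processor constructs the first classification information dictionary according to the correspondence between the sound features of emergency vehicle sirens and emergency vehicle types and the correspondence between the sounding of emergency vehicle sirens and emergency vehicle states, and writes the first classification information dictionary to the memory, where the sound features of an emergency vehicle siren include a sound frequency range and a pitch-change period.
In one possible implementation, the processor is further specifically configured to:
detect the video image to obtain video frames of the emergency vehicle;
perform image semantic segmentation on the video frames to obtain the warning light region of the emergency vehicle;
perform video semantic analysis on the warning light region of the emergency vehicle to obtain the color and flashing pattern of the warning light;
generate the visual category code of the video image according to the color and flashing pattern of the warning light.
In one possible implementation, before analyzing the video image, the processor constructs a second classification information dictionary according to the correspondence between the lighting features of emergency vehicle warning lights and emergency vehicle types and the correspondence between warning light flashing patterns and emergency vehicle states, and writes the second classification information dictionary to the memory, where the lighting features of a warning light include its color and its flashing pattern.
In one possible implementation, the processor is further specifically configured to:
generate an access index according to the auditory category code and/or the visual category code;
match the type and state of the emergency vehicle from the first classification information dictionary and/or the second classification information dictionary according to the access index.
In one possible implementation, the processor is further specifically configured to:
when the access index contains the auditory category code, look up in the first classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to the auditory category code;
when the access index contains the visual category code, look up in the second classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to the visual category code;
when the access index contains both the auditory category code and the visual category code, look up in the first classification information dictionary and the second classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to both codes together.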
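In yet another aspect, an embodiment of the present invention further provides an apparatus for identifying an emergency vehicle, configured to perform the emergency vehicle identification method provided by the embodiment of the present invention, the apparatus including:
a data receiving unit, configured to acquire a sound signal and a video image of the environment in which the emergency vehicle is located;
a multimodal perception unit, configured to analyze the sound signal to obtain an auditory category code of the sound signal, and to analyze the video image to obtain a visual category code of the video image;
a matching unit, configured to determine the type and state of the emergency vehicle according to the auditory category code of the sound signal and/or the visual category code of the video image.
In one possible implementation, the multimodal perception unit includes an auditory processing module, and the auditory processing module is specifically configured to:
divide the sound signal into frames;
extract acoustic features from the framed sound signal to obtain a sound sequence of the sound signal;
detect and recognize the sound sequence of the sound signal to obtain the auditory category code of the sound signal.
In one possible implementation, the apparatus further includes:
In one possible implementation, the auditory processing module is further specifically configured to:
construct a sound recognition model;
train the sound recognition model to obtain type identification parameters of the sound recognition model;
where the sound recognition model detects and recognizes the sound sequence of the sound signal according to the type identification parameters to obtain the auditory category code of the sound signal.
In one possible implementation, the auditory processing module reads the sound signals in a first classification information dictionary to train the sound recognition model, where the first classification information dictionary contains at least correspondence information between the sound features of emergency vehicle sirens and emergency vehicle types, and correspondence information between the sounding of emergency vehicle sirens and emergency vehicle states.
In one possible implementation, before training the sound recognition model, the auditory processing module constructs the first classification information dictionary according to the correspondence between the sound features of emergency vehicle sirens and emergency vehicle types and the correspondence between the sounding of emergency vehicle sirens and emergency vehicle states, and stores the first classification information dictionary, where the sound features of an emergency vehicle siren include a sound frequency range and a pitch-change period.
In one possible implementation, the multimodal perception unit further includes a visual processing module, and the visual processing module is configured to:
detect the video image to obtain video frames of the emergency vehicle;
perform image semantic segmentation on the video frames to obtain the warning light region of the emergency vehicle;
perform video semantic analysis on the warning light region of the emergency vehicle to obtain the color and flashing pattern of the warning light;
generate the visual category code of the video image according to the color and flashing pattern of the warning light.
In one possible implementation, before the video image is analyzed, the visual processing module constructs a second classification information dictionary according to the correspondence between the lighting features of emergency vehicle warning lights and emergency vehicle types and the correspondence between warning light flashing patterns and emergency vehicle states, and stores the second classification information dictionary, where the lighting features of a warning light include its color and its flashing pattern.
In one possible implementation, the multimodal perception unit further includes an access index generating module;
the access index generating module is configured to generate an access index according to the auditory category code and/or the visual category code;
the matching unit matches the type and state of the emergency vehicle from the first classification information dictionary and/or the second classification information dictionary according to the access index.
In one possible implementation,
when the access index contains the auditory category code, the matching unit looks up in the first classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to the auditory category code;
when the access index contains the visual category code, the matching unit looks up in the second classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to the visual category code;
when the access index contains both the auditory category code and the visual category code, the matching unit looks up in the first classification information dictionary and the second classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to both codes together.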
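In a further aspect, an embodiment of the present invention further provides a vehicle that includes any of the emergency vehicle identification apparatuses provided by the embodiments of the present invention, the emergency vehicle identification apparatus being used to identify the type and state of an emergency vehicle.
The emergency vehicle identification method provided by the embodiment of the present invention acquires multimodal information, namely the sound signal and video image information of the environment in which the emergency vehicle is located, and analyzes the sound signal and the video image information separately to obtain the auditory category code of the sound signal and the visual category code of the video image. Using the auditory category code, the visual category code, or both together, it looks up and matches the type and active state of the emergency vehicle to be identified in the standard correspondence information, defined by existing standards, that relates emergency vehicle types and states to siren sound frequency, pitch-change period, warning light color and warning light flashing pattern. This allows emergency vehicles to be identified accurately in the following scenarios:
when different emergency vehicles have different siren sounds, the auditory category code alone is sufficient to detect and identify the emergency vehicle;
when different emergency vehicles have different warning light colors and flashing patterns, the visual category code alone is sufficient to detect and identify the emergency vehicle;
when different emergency vehicles may share the same siren sound, or may share the same warning light color and flashing pattern, the auditory category code and the visual category code can be combined to detect and identify the emergency vehicle, improving identification accuracy.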
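Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an emergency vehicle identification apparatus provided by an embodiment of the present invention;
FIG. 2 is a flowchart of an emergency vehicle identification method provided by an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of another emergency vehicle identification apparatus provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of dividing a sound signal into frames according to an embodiment of the present invention;
FIG. 5 is the observable sound sequence obtained in an embodiment of the present invention by extracting acoustic features from the framed sound signal shown in FIG. 4;
FIG. 6 is a schematic diagram of how the sound recognition model provided by an embodiment of the present invention recognizes the sound sequence shown in FIG. 5;
FIG. 7 is a schematic diagram of image semantic segmentation of a video frame of an emergency vehicle according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of video semantic analysis of the warning light region of an emergency vehicle according to an embodiment of the present invention.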
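Detailed Description
The technical solutions of the embodiments of the present invention are described in further detail below with reference to the drawings and specific embodiments.
FIG. 1 is a schematic structural diagram of an emergency vehicle identification apparatus provided by an embodiment of the present invention. As shown in FIG. 1, the apparatus includes a sound sensor 101, an optical sensor 102, a processor 103 and a memory 104.
The sound sensor 101 is configured to acquire a sound signal of the environment in which the emergency vehicle is located.
The optical sensor 102 is configured to acquire a video image of the environment in which the emergency vehicle is located.
Optionally, the optical sensor 102 may be a camera or a lidar-based optical scanning device.
The processor 103 is configured to analyze the sound signal to obtain an auditory category code of the sound signal, analyze the video image to obtain a visual category code of the video image, and match the type and state of the emergency vehicle from the memory 104 according to the auditory category code and the visual category code.
The memory 104 is configured to store the emergency vehicle types and all possible state information.
FIG. 2 is a flowchart of an emergency vehicle identification method provided by an embodiment of the present invention. As shown in FIG. 2, the method is performed by an apparatus capable of identifying emergency vehicles, which may be the apparatus shown in FIG. 1. The method is carried out as follows: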
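Step S200: Acquire a sound signal and a video image of the environment in which the emergency vehicle is located.
While driving, an emergency vehicle generally turns on its siren and warning light. The sirens of different emergency vehicles (such as police cars, ambulances, engineering repair vehicles and fire engines) sound according to their own frequency and pitch-change rules, and their warning lights emit light according to their own colors and flashing patterns. Because of environmental factors, however, the sound of a moving emergency vehicle is mixed with all kinds of other sounds in the environment. In automatic driving, therefore, identifying the emergency vehicles around a self-driving vehicle requires acquiring the sound signal and video image of the surrounding environment. Subsequent steps analyze the acquired sound signal and video image to identify the emergency vehicle together with its type and active state, so that the self-driving vehicle can adjust its driving plan and control.
Step S210: Analyze the sound signal to obtain the auditory category code of the sound signal, and analyze the video image to obtain the visual category code of the video image.
Specifically, analyzing the sound signal to obtain its auditory category code includes:
Step S2101: Divide the sound signal into frames.
Step S2102: Extract acoustic features from the framed sound signal to obtain a sound sequence of the sound signal.
Step S2103: Detect and recognize the sound sequence of the sound signal to obtain the auditory category code of the sound signal. This specifically includes:
First, a first classification information dictionary is constructed and stored. The first classification information dictionary may be constructed from the correspondence between the sound features of emergency vehicle sirens and emergency vehicle types and the correspondence between the sounding of a siren and the state of the emergency vehicle; that is, it contains at least correspondence information between siren sound features and emergency vehicle types, and correspondence information between the sounding of a siren and the emergency vehicle state. The sound features of a siren include the sound frequency range and the pitch-change period. For example, the first classification information dictionary may be built from the standard correspondence information between siren sound frequency range, pitch-change period and emergency vehicle type specified in the Chinese national standard GB8108-1999. The sound emitted by each kind of emergency vehicle's siren has a corresponding frequency range, and sounds within that frequency range have a corresponding pitch-change period. The state of an emergency vehicle is either active or inactive: a sounding siren indicates active and a silent siren indicates inactive, so as soon as a siren sound signal is detected the emergency vehicle can be considered active.
Second, a sound recognition model is constructed.
Next, the sound recognition model may be trained with the sound signals defined in the first classification information dictionary to obtain the type identification parameters of the sound recognition model.
Finally, the sound recognition model detects and recognizes the sound sequence of the sound signal according to the type identification parameters to obtain the auditory category code of the sound signal.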
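Specifically, analyzing the video image to obtain its visual category code includes:
Step S2104: Construct a second classification information dictionary and store it. Specifically, the second classification information dictionary may be constructed from the correspondence between the lighting features of emergency vehicle warning lights and emergency vehicle types and the correspondence between flashing patterns and states, where the lighting features of a warning light include its color and its flashing pattern. That is, the second classification information dictionary contains at least the correspondence between warning light color, flashing pattern and emergency vehicle type, and the correspondence between flashing pattern and state. When the warning light of each kind of emergency vehicle is lit, it has a corresponding color change and flashing pattern, and the state of an emergency vehicle can be either active or inactive, with the two states represented by different flashing patterns.
Step S2105: Detect the video image and determine the video frames of the emergency vehicle in the video image.
Step S2106: Perform image semantic segmentation on the video frames to obtain the warning light region of the emergency vehicle.
Step S2107: Perform video semantic analysis on the warning light region of the emergency vehicle to obtain the color and flashing pattern of the warning light.
Step S2108: Generate the visual category code of the video image according to the color and flashing pattern of the warning light.
Step S220: Determine the type and state of the emergency vehicle according to the auditory category code of the sound signal and/or the visual category code of the video image.
Specifically, an access index is generated from the auditory category code and/or the visual category code. When the access index consists only of the auditory category code, the type and state of the emergency vehicle corresponding to the auditory category code are looked up in the first classification information dictionary according to the access index. When the access index consists only of the visual category code, the type and state of the emergency vehicle corresponding to the visual category code are looked up in the second classification information dictionary. When the access index consists of both the auditory category code and the visual category code, the type and state of the emergency vehicle corresponding to both codes together are looked up in the first and second classification information dictionaries.
In other words, the lookup uses the auditory category code against the first classification information dictionary, the visual category code against the second classification information dictionary, or both codes against both dictionaries to find the type and state that the two codes jointly correspond to.
In the embodiment of the present invention, the sound signal and the video image of the environment in which the emergency vehicle is located are analyzed separately to obtain the auditory category code and the visual category code. The type and state of the emergency vehicle can be identified from either the auditory category code or the visual category code alone, or from the auditory category code and the visual category code combined, which further improves identification accuracy.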
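The three lookup cases of step S220 can be sketched as set operations. This is only an illustrative sketch: the function name `identify`, the representation of each dictionary as a mapping from a category code to a set of candidate (type, state) pairs, and the example entries are all assumptions, not details taken from the patent.

```python
def identify(first_dict, second_dict, auditory_code=None, visual_code=None):
    """Return the candidate (type, state) pairs for the codes that are present.

    first_dict / second_dict are hypothetical stand-ins for the first and
    second classification information dictionaries.
    """
    candidates = None
    if auditory_code is not None:
        # Auditory code present: consult the first dictionary.
        candidates = set(first_dict.get(auditory_code, ()))
    if visual_code is not None:
        # Visual code present: consult the second dictionary.
        visual = set(second_dict.get(visual_code, ()))
        # Both present: keep only entries that both codes agree on.
        candidates = visual if candidates is None else candidates & visual
    return candidates if candidates is not None else set()


# Hypothetical entries: one siren shared by two vehicle types,
# one light pattern shared by two vehicle types.
first = {0: {("police car", "active"), ("ambulance", "active")}}
second = {1: {("police car", "active"), ("fire engine", "active")}}
result = identify(first, second, auditory_code=0, visual_code=1)
```

With these made-up entries, either code alone leaves two candidates, while the two codes together narrow the result to a single entry, which is the situation the combined lookup is meant to handle.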
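The emergency vehicle identification method provided by the embodiment of the present invention has been described in detail above with reference to FIG. 2. An apparatus capable of identifying emergency vehicles is described in detail below with reference to FIG. 3.
FIG. 3 is a schematic structural diagram of another emergency vehicle identification apparatus provided by an embodiment of the present invention. The apparatus can identify emergency vehicles and can be used to perform the method shown in FIG. 2. As shown in FIG. 3, the apparatus includes a data receiving unit 301, a multimodal perception unit 302, a matching unit 303 and a storage unit 304.
The data receiving unit 301 includes a sound signal receiving module 3011 for collecting sound signals around the self-driving car and a video image receiving module 3012 for collecting video images around the self-driving car.
The multimodal perception unit 302 includes an auditory processing module 3021, a visual processing module 3022 and an access index generating module 3023. The auditory processing module 3021 divides the collected sound signal into frames, extracts acoustic features from the framed sound signal, and then builds an acoustic recognition model to recognize the acoustic features of the sound signal and obtain its auditory category code. The visual processing module 3022 detects the collected video image to determine the video frames of the emergency vehicle in it, performs image semantic segmentation on those frames to obtain a video stream of the emergency vehicle's warning light region, and then performs video semantic analysis on that video stream to obtain the color and flashing pattern of the warning light and hence the visual category code of the emergency vehicle. The access index generating module 3023 combines the auditory category code and the visual category code to generate an access index.
Optionally, when processing the sound signal and the video image, the auditory processing module 3021 and the visual processing module 3022 may be two different processors (or hardware function devices), two dedicated processing modules within the same processor, or two different program modules invoked by one processor, one handling the sound signal and the other handling the video image.
The matching unit 303 accesses the storage unit 304 according to the access index and matches the type and state of the emergency vehicle from the storage unit 304.
The storage unit 304 stores a first classification information dictionary and a second classification information dictionary, which cover the emergency vehicle types and all of their possible states. The first classification information dictionary is correspondence information that defines the emergency vehicle type and state according to the sound frequency and pitch-change period of the emergency vehicle's siren. The second classification information dictionary is correspondence information that defines the emergency vehicle type and state according to the color and flashing pattern of the emergency vehicle's warning light.
Specifically, the auditory processing module 3021 is implemented as follows: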
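As shown in FIG. 4, the auditory processing module 3021 uses a window function to divide the received sound signal into frames according to the desired frame length and frame shift, that is, it cuts the received sound signal into multiple overlapping frames. For example, the signal is cut with a frame length of 25 ms and a frame shift of 15 ms, where the frame shift is calculated as 25 ms - 10 ms = 15 ms.
The window function used in the embodiment of the present invention is a truncation function that cuts out a segment of the sound signal. The frame shift here refers to the amount of overlap between two adjacent frames of data, that is, the overlap between the tail of the previous frame and the head of the next frame, so in the example above each frame starts 10 ms after the previous one. Because of this overlap, each frame contains components of the previous frame and the next frame, which prevents discontinuities between frames, keeps the data of neighboring frames correlated, and brings the framed signal closer to the actual sound.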
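The framing step can be illustrated with a short NumPy sketch. It follows the text's convention that 15 ms is the overlap between adjacent 25 ms frames, so each frame starts 10 ms after the previous one. The function name `frame_signal`, the 16 kHz sample rate and the choice of a Hamming window are assumptions made for the example; the patent only says "a window function".

```python
import numpy as np


def frame_signal(signal, sample_rate, frame_ms=25.0, overlap_ms=15.0):
    """Split a 1-D signal into overlapping, windowed frames.

    Returns (frames, hop) where frames has shape (n_frames, frame_len)
    and hop is the number of samples between consecutive frame starts.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    overlap = int(round(sample_rate * overlap_ms / 1000.0))
    hop = frame_len - overlap  # 25 ms - 15 ms = 10 ms between frame starts
    if hop <= 0:
        raise ValueError("overlap must be shorter than the frame length")
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)  # assumed window; any taper could be used
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return frames, hop


# One second of a 1 kHz tone sampled at an assumed 16 kHz.
sr = 16000
t = np.arange(sr) / sr
frames, hop = frame_signal(np.sin(2 * np.pi * 1000.0 * t), sr)
```

At 16 kHz a 25 ms frame is 400 samples and the 10 ms advance is 160 samples, so one second of audio yields 98 full frames; trailing samples that do not fill a frame are dropped in this sketch.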
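As shown in FIG. 5, the auditory processing module 3021 performs acoustic feature extraction on the framed sound signal by extracting Mel Frequency Cepstral Coefficients (MFCC). Specifically, according to the physiological characteristics of the human ear, the auditory processing module 3021 extracts the MFCC features of M points from each frame waveform of the sound signal and transforms them into an M-dimensional vector. The M-dimensional vector of each frame contains the content information of that frame of the sound signal. The M-dimensional vectors of all frames are combined into a matrix of M rows and N columns, giving an observable sound sequence that characterizes the sound signal, where N is the total number of frames. For example, the embodiment of the present invention takes M equal to 12 and obtains the observable sound sequence of 12 rows by N columns shown in FIG. 5, in which each frame is represented by a 12-dimensional vector and the shade of each color block indicates the magnitude of the vector value.
Specifically, the MFCC features of each frame of the sound signal are extracted as follows: apply a Fast Fourier Transform (FFT) to each short-time analysis window to obtain the corresponding spectrum; pass the spectrum through a Mel filter bank to obtain the Mel spectrum; perform cepstral analysis on the Mel spectrum, that is, take the logarithm and apply the inverse transform, implemented as a discrete cosine transform (DCT); and take the 2nd to 13th coefficients after the DCT as the MFCC coefficients. The resulting Mel frequency cepstral coefficients are the features of that frame of sound.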
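The FFT, Mel filter bank, logarithm and DCT chain described above can be sketched in plain NumPy. The text fixes only M = 12 and the use of DCT coefficients 2 to 13; the FFT size of 512, the 26 triangular filters, the 16 kHz sample rate and all function names below are assumptions chosen for illustration.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed design)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank


def mfcc(frames, sample_rate, n_fft=512, n_filters=26, n_coeffs=12):
    """Return an (n_coeffs x n_frames) matrix, one column per frame."""
    # 1. FFT of each short-time frame -> power spectrum.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 2. Mel filter bank -> Mel spectrum.
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    # 3. Logarithm, then DCT-II as the inverse transform of cepstral analysis.
    log_mel = np.log(mel_energy + 1e-10)
    n = np.arange(n_filters)
    k = np.arange(n_filters)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    cepstrum = log_mel @ basis.T
    # 4. Keep the 2nd to 13th coefficients (indices 1..12).
    return cepstrum[:, 1:1 + n_coeffs].T


# 98 frames of 400 samples each, here filled with a synthetic tone.
sr = 16000
t = np.arange(400) / sr
demo_frames = np.tile(np.sin(2 * np.pi * 1000.0 * t) * np.hamming(400), (98, 1))
features = mfcc(demo_frames, sr)
```

The result has 12 rows and one column per frame, matching the M-row by N-column observable sound sequence described for FIG. 5.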
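Further, in the process of extracting the acoustic features of the sound waveform, besides extracting Mel Frequency Cepstral Coefficients (MFCC), the auditory processing module 3021 may also use deep learning to extract the acoustic features of the sound signal.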
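Further, as shown in FIG. 6, the auditory processing module 3021 may construct a sound recognition model based on a hidden Markov model (HMM), use the sound recognition model to recognize the sound sequence and obtain its category information, and encode and output the category information to obtain the auditory category code of the sound sequence. Specifically, after constructing the sound recognition model, the auditory processing module 3021 may train it with the sound signals in the first classification information dictionary to obtain the type identification parameters of the sound recognition model. The sound recognition model recognizes the sound sequence produced by MFCC feature extraction according to the type identification parameters and obtains the auditory category code of the sound sequence. The auditory category refers to the category of the sound signal, which can be distinguished by a sound frequency and pitch-change period that conform to standards or conventions.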
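One common way to use HMMs for this kind of classification is to keep one trained model per category and pick the category whose model gives the observed sequence the highest likelihood. The sketch below shows that idea with the scaled forward algorithm. It is not the patent's stated procedure: it assumes the MFCC vectors have already been quantized to discrete symbols, and the two toy models and their parameters are made up rather than trained.

```python
import numpy as np


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete symbols."""
    alpha = start * emit[:, obs[0]]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = (alpha @ trans) * emit[:, obs[t]]
        scale = alpha.sum()
        if scale == 0.0:
            return -np.inf  # sequence is impossible under this model
        log_lik += np.log(scale)
        alpha = alpha / scale  # rescale to avoid numerical underflow
    return log_lik


def classify(obs, models):
    """Return the category code whose HMM best explains the sequence."""
    scores = {code: forward_log_likelihood(obs, *params)
              for code, params in models.items()}
    return max(scores, key=scores.get)


# Hypothetical parameters: symbol 0 = low pitch, symbol 1 = high pitch.
start = np.array([1.0, 0.0])
emit = np.array([[0.9, 0.1], [0.1, 0.9]])
models = {
    0: (start, np.array([[0.9, 0.1], [0.1, 0.9]]), emit),  # slow pitch change
    1: (start, np.array([[0.1, 0.9], [0.9, 0.1]]), emit),  # fast pitch change
}
slow_code = classify([0, 0, 0, 0, 1, 1, 1, 1], models)
fast_code = classify([0, 1, 0, 1, 0, 1, 0, 1], models)
```

Here the returned dictionary key plays the role of the auditory category code; in practice the model parameters would come from training on labeled siren recordings rather than being written by hand.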
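Further, the sound signals in the first classification information dictionary are a large number of labeled road-environment sound signals; that is, they may be standard correspondence data that define siren sounds by the sound characteristics (sound frequency and pitch-change period) specified in the Chinese national standard GB8108-1999. For example, as shown in Table 1, a sound signal whose frequency lies within the range given in the original publication as image PCTCN2017082915-appb-000001 and whose pitch-change period is between 0.333 and 0.385 is the emergency frequency-modulated tone, and the corresponding vehicle type is a police car.
Table 1: Sound frequency and pitch-change period
[Table 1 appears in the original publication as image PCTCN2017082915-appb-000002]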
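In practical applications, the auditory processing module 3021 provided by the embodiment of the present invention may also be used to detect and recognize other objects, as long as the one-to-one correspondence between the sound features emitted by an object and that object is stored as the first classification information dictionary.
Taking the national standard above as an example, different emergency vehicles have different siren sounds, so in this case the auditory processing module 3021 alone is enough to detect and identify an active emergency vehicle.
Sometimes, however, the same siren sound may be used by different emergency vehicles, so the visual processing module 3022 needs to be used as well in order to detect and identify active emergency vehicles more accurately.
The visual processing module 3022 is implemented as follows: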
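Specifically, the visual processing module 3022 may use a multi-target detection algorithm to detect whether there is an emergency vehicle in the video image and obtain the video frames of the emergency vehicle. The multi-target detection algorithm may be a real-time multi-target detection algorithm such as SSD or YOLO, or the R-CNN, Fast R-CNN or Faster R-CNN algorithm.
Further, as shown in FIG. 7, the visual processing module 3022 may use a convolutional neural network (CNN) combined with a deep convolutional neural network (DCNN) to perform image semantic segmentation on the video frame and cut out the warning light region. Specifically, the CNN first extracts the features of each pixel in the image, and the DCNN then restores the position of each pixel in the original image, achieving pixel-level recognition and outputting the segmentation of the warning light region.
Further, as shown in FIG. 8, the visual processing module 3022 performs video semantic analysis on the video stream of the segmented warning light region, detects and recognizes the color and flashing pattern of the warning light, and thereby obtains the visual category code of the video image. Specifically, a CNN may be combined with a Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM). For the region of interest (ROI) already segmented in each frame of the multi-frame video stream, namely the warning light region, the CNN extracts the features of the ROI in each frame, including the different color blocks and shapes of the warning light in the current frame. An RNN implemented with a two-layer LSTM then extracts the temporal relationship between frames from these features, outputs the color-change characteristics of the warning light, determines whether the warning light is flashing and what color its color blocks are, and outputs the visual category code. The visual category refers to a category distinguished by a warning light color and color change (that is, a flashing pattern) that conform to standards or conventions. The visual category code is the code of the visual category.
In the embodiment of the present invention, for streaming video the visual processing module 3022 uses target detection and ROI segmentation to restrict video semantic analysis to the limited area of the warning light, which reduces the amount of computation and increases recognition speed, making it suitable for self-driving cars with limited computing power.
In addition, when different emergency vehicles have different warning lights and flashing patterns, the processing and recognition of the video image by the visual processing module 3022 alone is enough to detect and identify an active emergency vehicle.
In practical applications, the visual processing module 3022 provided by the embodiment of the present invention may also be used to detect and recognize other objects, as long as the one-to-one correspondence between an object's visual features and that object is stored as the second classification information dictionary.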
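Further, after the auditory category code and the visual category code are obtained, the access index generating module 3023 generates an access index from the auditory category code and/or the visual category code.
Further, when the access index consists only of the auditory category code, the matching unit 303 looks up, in the first classification information dictionary, the type and state of the emergency vehicle corresponding to the auditory category code. When the access index consists only of the visual category code, the matching unit 303 looks up, in the second classification information dictionary, the type and state of the emergency vehicle corresponding to the visual category code. When the access index consists of both the auditory category code and the visual category code, the matching unit 303 looks up, in the first and second classification information dictionaries, the type and state of the emergency vehicle that corresponds to both codes together.
Further, the way the matching unit 303 indexes the storage unit 304 may depend on how the storage unit 304 stores the emergency vehicle types and states, for example:
The storage unit 304 may use enumeration to store the vehicle types and all of their possible states in advance.
As shown in Table 2, the vehicle type and state data of the emergency vehicles can be stored as a two-dimensional table. Table 2 has N rows and 2 columns: each row corresponds to one emergency vehicle and its state, one column gives the type of the emergency vehicle, and the other column gives its state.
Table 2: One storage list for emergency vehicles
Vehicle type    State
Police car      Inactive
Police car      Active
Fire engine     Inactive
···             ···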
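When the storage unit 304 uses the storage layout shown in Table 2, the access index generating module 3023 concatenates the visual category code and the auditory category code into an index code and uses it to access the storage unit 304, thereby determining the type and state of the emergency vehicle. Specifically, assuming there are N classes of visual category information and M classes of auditory category information, the visual category code and the auditory category code are log2(N) bits and log2(M) bits long respectively, and the index used to access the storage unit has (log2(N) + log2(M)) bits in total.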
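The concatenated index can be sketched with integer bit operations. The function names and the choice to place the visual code in the high bits are assumptions for illustration; the text only says the two codes are concatenated. The sketch also rounds the code width up to a whole number of bits, which the text's log2(N) and log2(M) leave implicit when N or M is not a power of two.

```python
def code_width(n_categories):
    """Bits needed to number n_categories codes, rounded up (at least 1)."""
    return max(1, (n_categories - 1).bit_length())


def concat_index(visual_code, auditory_code, n_visual, n_auditory):
    """Concatenate the two codes: visual in the high bits, auditory in the low bits."""
    if not (0 <= visual_code < n_visual and 0 <= auditory_code < n_auditory):
        raise ValueError("category code out of range")
    return (visual_code << code_width(n_auditory)) | auditory_code


# Assumed sizes: N = 4 visual categories (2 bits), M = 8 auditory categories (3 bits).
index = concat_index(visual_code=2, auditory_code=3, n_visual=4, n_auditory=8)
total_bits = code_width(4) + code_width(8)
```

With these assumed sizes the index is 5 bits wide, and the binary value `10` followed by `011` gives row 19 of the enumerated table.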
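Optionally, as shown in Table 3, the storage unit 304 may also store the vehicle type and state data in a two-dimensional table in which each entry is a two-dimensional vector whose two components are the type of the emergency vehicle and its state.
Table 3: Another storage list for emergency vehicles
<Police car, inactive>                  <Fire engine, active>
<Fire engine, inactive>                 <Police car, active>
<Ambulance, inactive>                   <Engineering rescue vehicle, active>
···                                     ···
<Engineering rescue vehicle, inactive>  <Ambulance, active>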
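When the storage unit 304 uses the storage layout shown in Table 3, the access index generating module 3023 uses the visual category code and the auditory category code as the row index and the column index respectively to access the storage unit 304, thereby determining the type and state of the emergency vehicle. Specifically, assuming there are N classes of visual category information and M classes of auditory category information, the visual and auditory category codes are log2(N) bits and log2(M) bits long respectively.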
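The row-and-column access for the Table 3 layout amounts to indexing a nested list. In this sketch the table holds only the first three rows listed in Table 3 (the elided rows are not filled in), and the assignment of particular code values to particular rows and columns is hypothetical.

```python
# Each entry is a (vehicle type, state) pair, as in Table 3.
TABLE_3 = [
    [("police car", "inactive"), ("fire engine", "active")],
    [("fire engine", "inactive"), ("police car", "active")],
    [("ambulance", "inactive"), ("engineering rescue vehicle", "active")],
]


def lookup(table, visual_code, auditory_code):
    """Use the visual code as the row index and the auditory code as the column index."""
    return table[visual_code][auditory_code]


vehicle_type, state = lookup(TABLE_3, visual_code=1, auditory_code=1)
```

Adding a new vehicle type or state under this layout means appending a row or column, which is the kind of extension the text attributes to the table-based storage.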
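In the embodiment of the present invention, the two-dimensional table storage layouts shown in Table 2 and Table 3 make it easy to extend the stored emergency vehicle types and states, simply by extending the contents of the storage unit.
The emergency vehicle identification apparatuses of FIG. 1 and FIG. 3 provided by the embodiment of the present invention can be applied to self-driving vehicles and other traffic equipment to identify nearby emergency vehicles in time, and can also be applied to other moving objects that are identified by sound and video, so that the emergency vehicles and those other objects can be avoided in time.
In the embodiment of the present invention, when different emergency vehicles may have the same siren sound, or the same warning light and the same flashing pattern, the emergency vehicle identification apparatus can combine the processing of multimodal data by the auditory processing module 3021 and the visual processing module 3022 to identify the type of the emergency vehicle and whether it is active or inactive efficiently and accurately, supporting the planning and control of automatic driving.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the present invention, and such modifications or replacements shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of protection of the claims.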

Claims (28)

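  1. A method for identifying an emergency vehicle, characterized in that the method comprises:
    acquiring a sound signal and a video image of the environment in which the emergency vehicle is located;
    analyzing the sound signal to obtain an auditory category code of the sound signal, and analyzing the video image to obtain a visual category code of the video image;
    determining the type and state of the emergency vehicle according to the auditory category code of the sound signal and/or the visual category code of the video image.
  2. The method according to claim 1, characterized in that analyzing the sound signal to obtain the auditory category code of the sound signal specifically comprises:
    dividing the sound signal into frames;
    extracting acoustic features from the framed sound signal to obtain a sound sequence of the sound signal;
    detecting and recognizing the sound sequence of the sound signal to obtain the auditory category code of the sound signal.
  3. The method according to claim 2, characterized in that detecting and recognizing the sound sequence of the sound signal to obtain the auditory category code of the sound signal specifically comprises:
    constructing a sound recognition model;
    training the sound recognition model to obtain type identification parameters of the sound recognition model;
    the sound recognition model detecting and recognizing the sound sequence of the sound signal according to the type identification parameters to obtain the auditory category code of the sound signal.
  4. The method according to any one of claims 1 to 3, characterized in that training the sound recognition model specifically comprises:
    training the sound recognition model with the sound signals in a first classification information dictionary, the first classification information dictionary containing at least correspondence information between the sound features of emergency vehicle sirens and emergency vehicle types, and correspondence information between the sounding of emergency vehicle sirens and emergency vehicle states.
  5. The method according to claim 4, characterized in that, before the sound recognition model is trained with the sound signals in the first classification information dictionary, the method further comprises:
    constructing the first classification information dictionary according to the correspondence between the sound features of emergency vehicle sirens and emergency vehicle types and the correspondence between the sounding of emergency vehicle sirens and emergency vehicle states, and storing the first classification information dictionary, the sound features of an emergency vehicle siren comprising a sound frequency range and a pitch-change period.
  6. The method according to any one of claims 1 to 5, characterized in that analyzing the video image to obtain the visual category code of the video image specifically comprises:
    detecting the video image to obtain video frames of the emergency vehicle;
    performing image semantic segmentation on the video frames to obtain the warning light region of the emergency vehicle;
    performing video semantic analysis on the warning light region of the emergency vehicle to obtain the color and flashing pattern of the warning light;
    generating the visual category code of the video image according to the color and flashing pattern of the warning light.
  7. The method according to claim 6, characterized in that, before the video image is analyzed to obtain the visual category code of the video image, the method further comprises:
    constructing a second classification information dictionary according to the correspondence between the lighting features of emergency vehicle warning lights and emergency vehicle types and the correspondence between warning light flashing patterns and emergency vehicle states, and storing the second classification information dictionary, the lighting features of a warning light comprising its color and its flashing pattern.
  8. The method according to claim 5 or 7, characterized in that determining the type and state of the emergency vehicle according to the auditory category code of the sound signal and/or the visual category code of the video image specifically comprises:
    generating an access index according to the auditory category code and/or the visual category code;
    matching the type and state of the emergency vehicle from the first classification information dictionary and/or the second classification information dictionary according to the access index.
  9. The method according to claim 8, characterized in that:
    when the access index contains the auditory category code, the type and state of the emergency vehicle corresponding to the auditory category code are looked up in the first classification information dictionary according to the access index;
    when the access index contains the visual category code, the type and state of the emergency vehicle corresponding to the visual category code are looked up in the second classification information dictionary according to the access index;
    when the access index contains both the auditory category code and the visual category code, the type and state of the emergency vehicle corresponding to both codes together are looked up in the first classification information dictionary and the second classification information dictionary according to the access index.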
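  10. An apparatus for identifying an emergency vehicle, characterized by comprising:
    a sound sensor, configured to acquire a sound signal of the environment in which the emergency vehicle is located;
    an optical sensor, configured to acquire a video image of the environment in which the emergency vehicle is located;
    a processor, configured to analyze the sound signal to obtain an auditory category code of the sound signal, analyze the video image to obtain a visual category code of the video image, and determine the type and state of the emergency vehicle according to the auditory category code and/or the visual category code;
    a memory, configured to store emergency vehicle type and state information.
  11. The apparatus according to claim 10, characterized in that the processor is specifically configured to:
    divide the sound signal into frames;
    extract acoustic features from the framed sound signal to obtain a sound sequence of the sound signal;
    detect and recognize the sound sequence of the sound signal to obtain the auditory category code of the sound signal.
  12. The apparatus according to claim 11, characterized in that the processor is further specifically configured to:
    construct a sound recognition model;
    train the sound recognition model to obtain type identification parameters of the sound recognition model;
    wherein the sound recognition model detects and recognizes the sound sequence of the sound signal according to the type identification parameters to obtain the auditory category code of the sound signal.
  13. The apparatus according to any one of claims 10 to 12, characterized in that the processor reads the sound signals in a first classification information dictionary to train the sound recognition model, the first classification information dictionary containing at least correspondence information between the sound features of emergency vehicle sirens and emergency vehicle types, and correspondence information between the sounding of emergency vehicle sirens and emergency vehicle states.
  14. The apparatus according to claim 13, characterized in that, before training the sound recognition model, the processor constructs the first classification information dictionary according to the correspondence between the sound features of emergency vehicle sirens and emergency vehicle types and the correspondence between the sounding of emergency vehicle sirens and emergency vehicle states, and writes the first classification information dictionary to the memory, the sound features of an emergency vehicle siren comprising a sound frequency range and a pitch-change period.
  15. The apparatus according to any one of claims 10 to 14, characterized in that the processor is further specifically configured to:
    detect the video image to obtain video frames of the emergency vehicle;
    perform image semantic segmentation on the video frames to obtain the warning light region of the emergency vehicle;
    perform video semantic analysis on the warning light region of the emergency vehicle to obtain the color and flashing pattern of the warning light;
    generate the visual category code of the video image according to the color and flashing pattern of the warning light.
  16. The apparatus according to claim 15, characterized in that, before analyzing the video image, the processor constructs a second classification information dictionary according to the correspondence between the lighting features of emergency vehicle warning lights and emergency vehicle types and the correspondence between warning light flashing patterns and emergency vehicle states, and writes the second classification information dictionary to the memory, the lighting features of a warning light comprising its color and its flashing pattern.
  17. The apparatus according to claim 14 or 16, characterized in that the processor is further specifically configured to:
    generate an access index according to the auditory category code and/or the visual category code;
    match the type and state of the emergency vehicle from the first classification information dictionary and/or the second classification information dictionary according to the access index.
  18. The apparatus according to claim 17, characterized in that the processor is further specifically configured to:
    when the access index contains the auditory category code, look up in the first classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to the auditory category code;
    when the access index contains the visual category code, look up in the second classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to the visual category code;
    when the access index contains both the auditory category code and the visual category code, look up in the first classification information dictionary and the second classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to both codes together.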
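  19. An apparatus for identifying an emergency vehicle, characterized by comprising:
    a data receiving unit, configured to acquire a sound signal and a video image of the environment in which the emergency vehicle is located;
    a multimodal perception unit, configured to analyze the sound signal to obtain an auditory category code of the sound signal, and to analyze the video image to obtain a visual category code of the video image;
    a matching unit, configured to determine the type and state of the emergency vehicle according to the auditory category code of the sound signal and/or the visual category code of the video image.
  20. The apparatus according to claim 19, characterized in that the multimodal perception unit comprises an auditory processing module, the auditory processing module being specifically configured to:
    divide the sound signal into frames;
    extract acoustic features from the framed sound signal to obtain a sound sequence of the sound signal;
    detect and recognize the sound sequence of the sound signal to obtain the auditory category code of the sound signal.
  21. The apparatus according to claim 20, characterized in that the auditory processing module is further specifically configured to:
    construct a sound recognition model;
    train the sound recognition model to obtain type identification parameters of the sound recognition model;
    wherein the sound recognition model detects and recognizes the sound sequence of the sound signal according to the type identification parameters to obtain the auditory category code of the sound signal.
  22. The apparatus according to claim 21, characterized in that the auditory processing module reads the sound signals in a first classification information dictionary to train the sound recognition model, the first classification information dictionary containing at least correspondence information between the sound features of emergency vehicle sirens and emergency vehicle types, and correspondence information between the sounding of emergency vehicle sirens and emergency vehicle states.
  23. The apparatus according to claim 22, characterized in that, before training the sound recognition model, the auditory processing module constructs the first classification information dictionary according to the correspondence between the sound features of emergency vehicle sirens and emergency vehicle types and the correspondence between the sounding of emergency vehicle sirens and emergency vehicle states, and stores the first classification information dictionary, the sound features of an emergency vehicle siren comprising a sound frequency range and a pitch-change period.
  24. The apparatus according to any one of claims 19 to 23, characterized in that the multimodal perception unit further comprises a visual processing module, the visual processing module being configured to:
    detect the video image to obtain video frames of the emergency vehicle;
    perform image semantic segmentation on the video frames to obtain the warning light region of the emergency vehicle;
    perform video semantic analysis on the warning light region of the emergency vehicle to obtain the color and flashing pattern of the warning light;
    generate the visual category code of the video image according to the color and flashing pattern of the warning light.
  25. The apparatus according to claim 24, characterized in that, before the video image is analyzed, the visual processing module constructs a second classification information dictionary according to the correspondence between the lighting features of emergency vehicle warning lights and emergency vehicle types and the correspondence between warning light flashing patterns and emergency vehicle states, and stores the second classification information dictionary, the lighting features of a warning light comprising its color and its flashing pattern.
  26. The apparatus according to claim 23 or 25, characterized in that the multimodal perception unit further comprises an access index generating module;
    the access index generating module is configured to generate an access index according to the auditory category code and/or the visual category code;
    the matching unit matches the type and state of the emergency vehicle from the first classification information dictionary and/or the second classification information dictionary according to the access index.
  27. The apparatus according to claim 26, characterized in that:
    when the access index contains the auditory category code, the matching unit looks up in the first classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to the auditory category code;
    when the access index contains the visual category code, the matching unit looks up in the second classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to the visual category code;
    when the access index contains both the auditory category code and the visual category code, the matching unit looks up in the first classification information dictionary and the second classification information dictionary, according to the access index, the type and state of the emergency vehicle corresponding to both codes together.
  28. A vehicle, characterized by comprising the emergency vehicle identification apparatus according to any one of claims 10-18 or 19-27, the emergency vehicle identification apparatus being used to identify the type and state of an emergency vehicle.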
PCT/CN2017/082915 2017-05-03 2017-05-03 一种应急车辆的识别方法及装置 WO2018201349A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201780082732.1A CN110168625A (zh) 2017-05-03 2017-05-03 一种应急车辆的识别方法及装置
PCT/CN2017/082915 WO2018201349A1 (zh) 2017-05-03 2017-05-03 一种应急车辆的识别方法及装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/082915 WO2018201349A1 (zh) 2017-05-03 2017-05-03 一种应急车辆的识别方法及装置

Publications (1)

Publication Number Publication Date
WO2018201349A1 true WO2018201349A1 (zh) 2018-11-08

Family

ID=64015806

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/082915 WO2018201349A1 (zh) 2017-05-03 2017-05-03 一种应急车辆的识别方法及装置

Country Status (2)

Country Link
CN (1) CN110168625A (zh)
WO (1) WO2018201349A1 (zh)

Also Published As

Publication number Publication date
CN110168625A (zh) 2019-08-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17908484; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 17908484; Country of ref document: EP; Kind code of ref document: A1)