CN116978361A - Prompting method, prompting device, prompting equipment and storage medium - Google Patents

Publication number
CN116978361A
Authority
CN
China
Prior art keywords
audio information
prompt
spectrogram
prompting
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310996686.3A
Other languages
Chinese (zh)
Inventor
杜宝林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd
Priority to CN202310996686.3A
Publication of CN116978361A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00 Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/18 Status alarms
    • G08B21/24 Reminder alarms, e.g. anti-loss alarms
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/028 Voice signal separating using properties of sound source
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Business, Economics & Management (AREA)
  • Emergency Management (AREA)
  • General Physics & Mathematics (AREA)
  • Emergency Alarm Devices (AREA)

Abstract

The application provides a prompting method, apparatus, device, and storage medium, and relates to the field of computer technologies. The method comprises: acquiring initial audio information in a public transportation environment; identifying human voice audio information in the initial audio information based on a voice recognition algorithm; filtering the relevant audio information of the prompts out of the human voice audio information to obtain target audio information; and, when the volume of the target audio information is greater than a first preset threshold, generating prompt information reminding passengers that loud noise is prohibited. The application can be used while public transportation is in operation to generate prompt information prohibiting loud noise.

Description

Prompting method, prompting device, prompting equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a prompting method, apparatus, device, and storage medium.
Background
Making loud noise in public places is prohibited; in serious cases it disturbs public order, endangers public safety, infringes the rights of others, and is punishable under public security administration rules. With the development of public transportation, public travel modes have become widespread. In enclosed public spaces such as buses, noise from loud talking, phone calls, and electronic devices such as mobile phones played at high volume can affect the health of elderly passengers with chronic illnesses.
In the prior art, noise reminders in a public transportation environment are usually given manually by the driver as the situation requires. However, the driver should concentrate on driving; reminding passengers manually about loud behavior distracts the driver and affects driving safety.
Disclosure of Invention
The application provides a prompting method, apparatus, device, and storage medium that do not specifically recognize the voice content of passengers in a public transportation environment. Prompt information prohibiting loud noise is generated based only on high-volume audio information in that environment, which protects the passengers' personal privacy.
In a first aspect, the present application provides a prompting method applied to a broadcasting system that plays prompts in a public transportation environment, the method comprising: acquiring initial audio information in the public transportation environment; identifying human voice audio information in the initial audio information based on a voice recognition algorithm; filtering the relevant audio information of the prompts out of the human voice audio information to obtain target audio information; and generating prompt information prohibiting loud noise when the volume of the target audio information is greater than a first preset threshold.
According to the prompting method provided by the application, initial audio information in the public transportation environment is acquired, the human voice audio information in it is identified based on a voice recognition algorithm, the relevant audio information of the prompts is filtered out of the human voice audio information to obtain target audio information, and prompt information prohibiting loud noise is generated when the volume of the target audio information is greater than a first preset threshold. The method obtains the target audio information through simple identification and filtering of the initial audio information and generates the prompt information automatically, so that loud noise in the public transportation environment is prompted automatically. In addition, the method involves no complex language model or acoustic model, so it is simpler to implement and more generally applicable than the prior art. Finally, the method performs no semantic analysis of the passengers' audio; it only needs to identify whether high-volume passenger audio exists in the public transportation environment before generating prompt information prohibiting loud noise, which protects the passengers' personal privacy.
In one possible implementation, filtering the relevant audio information of the prompt out of the human voice audio information to obtain the target audio information includes: acquiring the relevant audio information of the prompt in the public transportation environment; converting the relevant audio information into a first spectrogram; converting the human voice audio information into a second spectrogram; performing similarity detection on the first spectrogram and the second spectrogram based on an image processing technique, and filtering out of the second spectrogram the part that is the same as the first spectrogram to obtain a third spectrogram; and converting the third spectrogram into the target audio information.
In another possible implementation, filtering the relevant audio information of the prompt out of the human voice audio information to obtain the target audio information includes: acquiring the relevant audio information of the prompt in the public transportation environment; determining a target frequency range of the relevant audio information; and using a band-stop filter to filter out the audio information within the target frequency range from the human voice audio information to obtain the target audio information.
In another possible implementation manner, the voice audio information is audio information with volume greater than a second preset threshold value after being processed based on a level filtering algorithm.
In another possible implementation manner, acquiring initial audio information in a public transportation environment includes: initial audio information in a public transportation environment is acquired by a low-sensitivity sound sensor.
In yet another possible implementation, the prompt includes at least one of: a stop prompt, a seat-offering prompt, a vehicle departure prompt, an in-vehicle safety prompt, and prompts commonly used by drivers.
In a second aspect, the present application provides a reminder device comprising: the device comprises an acquisition module, an identification module, a filtering module and a generation module.
The acquisition module is configured to acquire initial audio information in the public transportation environment; the recognition module is configured to identify the human voice audio information in the initial audio information based on a voice recognition algorithm; the filtering module is configured to filter the relevant audio information of the prompt out of the human voice audio information to obtain target audio information; and the generation module is configured to generate prompt information prohibiting loud noise when the volume of the target audio information is greater than a first preset threshold.
In one possible implementation, the filtering module is specifically configured to: acquire the relevant audio information of the prompt in the public transportation environment; convert the relevant audio information into a first spectrogram; convert the human voice audio information into a second spectrogram; perform similarity detection on the first spectrogram and the second spectrogram based on an image processing technique, and filter out of the second spectrogram the part that is the same as the first spectrogram to obtain a third spectrogram; and convert the third spectrogram into the target audio information.
In another possible implementation, the filtering module is specifically configured to: acquire the relevant audio information of the prompt in the public transportation environment; determine a target frequency range of the relevant audio information; and use a band-stop filter to filter out the audio information within the target frequency range from the human voice audio information to obtain the target audio information.
In another possible implementation manner, the voice audio information is audio information with volume greater than a second preset threshold value after being processed based on a level filtering algorithm.
In yet another possible implementation, the prompt includes at least one of: a stop prompt, a seat-offering prompt, a vehicle departure prompt, an in-vehicle safety prompt, and prompts commonly used by drivers.
In a third aspect, the present application provides an electronic device comprising: a processor and a memory; the memory stores instructions executable by the processor; the processor is configured to execute the instructions to cause the electronic device to implement the method of the first aspect described above.
In a fourth aspect, the present application provides a computer-readable storage medium comprising: computer software instructions; the computer software instructions, when run in an electronic device, cause the electronic device to implement the method of the first aspect described above.
In a fifth aspect, the present application provides a computer program product which, when run on a computer, causes the computer to perform the steps of the method described in the first aspect above.
Advantageous effects of the second aspect to the fifth aspect described above refer to corresponding descriptions of the first aspect, and are not repeated.
Drawings
FIG. 1 is a schematic view of an application environment of a prompting method provided by the present application;
FIG. 2 is a schematic flow chart of a prompting method provided by the application;
FIG. 3 is a schematic flow chart of another prompting method provided by the application;
FIG. 4 is a schematic flow chart of another prompting method according to the present application;
FIG. 5 is a schematic flow chart of another prompting method according to the present application;
FIG. 6 is a schematic diagram of a prompting device according to the present application;
FIG. 7 is a schematic diagram of an electronic device according to the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
In order to clearly describe the technical solution of the embodiments of the present application, in the embodiments of the present application, the terms "first", "second", etc. are used to distinguish the same item or similar items having substantially the same function and effect, and those skilled in the art will understand that the terms "first", "second", etc. are not limited in number and execution order.
Making loud noise in public places is prohibited; in serious cases it disturbs public order, endangers public safety, infringes the rights of others, and is punishable under public security administration rules. In enclosed public spaces such as buses, noise from loud talking, phone calls, and electronic devices such as mobile phones played at high volume can affect the health of elderly passengers with chronic illnesses. However, the driver should concentrate on driving, and manually playing a warning prompt distracts the driver and affects driving safety. Moreover, loud noise is sporadic: repeatedly playing a warning that prohibits loud noise is untargeted, and the repeated warning itself becomes noise.
In the related art, one technique for identifying and analyzing rule-violating behavior by subway passengers extracts audio and video frames from subway surveillance footage, screens speech and violations against a preset library, and finally plays a warning voice message through the camera. Another technique, for monitoring sound played aloud by subway passengers' electronic devices, classifies the collected sound source as human voice or non-human voice, decodes and recognizes the content of the human voice, and connects to the subway monitoring system so that the type of the target sound source is confirmed through manual video monitoring. When the monitoring staff confirm by comparison with the video that the target sound source comes from a passenger's electronic device playing aloud, a warning is played or the on-duty station staff are contacted by intercom to intervene. Both schemes, intelligent analysis of passenger violations and monitoring of sound from electronic devices, need to recognize speech semantics and combine this with video surveillance images; they are complex, costly, and not generally applicable. Moreover, recognizing and analyzing speech in public places may amount to surveillance and an invasion of personal privacy.
In addition, the bus environment contains a great deal of noise, multipath reflection, and reverberation, which degrades the quality of the picked-up voice signal and seriously reduces the voice recognition rate. Without acoustic processing support, such as microphone array technology, the recognition rate in real scenes is below 60%. Real scenes also contain multiple sound sources superimposed on environmental noise; ambient interference and several people speaking at once are common and make voice recognition harder. Current speech recognition engines operate in a single recognition mode and cannot process multiple speakers simultaneously. Microphone arrays are currently the main way to address these problems, but they have drawbacks, one of which is demanding hardware requirements, including microphones and chips, which increases cost.
In view of the foregoing, a method is needed that conveniently reminds passengers in a public transportation scene that loud noise is prohibited, without recognizing the content of their speech. In the embodiments of the present application, the prompting device obtains the target audio information through simple identification and filtering of the initial audio information and generates prompt information automatically, so that loud noise in the public transportation environment is prompted automatically. In addition, the embodiments involve no complex language model or acoustic model, so they are simpler to implement and more generally applicable than the prior art. Finally, the prompting device performs no semantic analysis of the passengers' audio; it only needs to identify whether high-volume passenger audio exists in the public transportation environment before generating prompt information prohibiting loud noise, which protects the passengers' personal privacy.
The prompting method provided by the application can be applied to the application environment shown in FIG. 1. As shown in FIG. 1, the application environment includes a prompting device 101 and an audio acquisition device 102, which are connected to each other.
In some embodiments, the prompting device 101 may be a server cluster formed by a plurality of servers, a single server, a computer, or a processor or processing chip in a server or computer. The embodiments of the present application do not limit the specific device form of the prompting device 101. In FIG. 1, the prompting device 101 is shown as a single server.
In some embodiments, the audio acquisition device 102 may be a device with the capability to acquire audio information, such as: microphones, recorders, audio collectors, pickups, sound sensors, etc. The embodiment of the present application does not limit the specific device configuration of the audio acquisition device 102. An audio acquisition device 102 is illustrated in fig. 1 as a microphone.
In some embodiments, when passengers in the public transportation environment need to be reminded that loud noise is prohibited, the prompting device 101 may acquire initial audio information in the public transportation environment through the audio acquisition device 102, identify and filter the initial audio information to obtain target audio information, and generate prompt information prohibiting loud noise when the volume of the target audio information is greater than a first preset threshold. The prompting device 101 may also send the generated prompt information to a broadcasting system, which plays or displays it in the public transportation environment.
Fig. 2 is a schematic flow chart of a prompting method according to an embodiment of the present application. As shown in fig. 2, the prompting method provided by the present application may be implemented by the foregoing prompting device, and specifically includes the following steps:
S201, the prompting device acquires initial audio information in the public transportation environment.
In some embodiments, the prompting device may acquire the initial audio information in the public transportation environment through the low-sensitivity sound sensor, so that the initial audio information may be processed to obtain the target audio information.
By way of example, since the embodiments of the application process only the higher-volume audio information in the public transportation environment, the prompting device can use a low-sensitivity sound sensor (such as a low-sensitivity microphone or recorder) to acquire the higher-volume initial audio information without collecting other, irrelevant audio. A low-sensitivity sound sensor is one with low output and no amplification: it outputs no electrical signal for faint audio and outputs a signal only for high-volume audio. Alternatively, a dedicated circuit filters out the low-volume electrical signals and retains only the high-volume ones. Either way, processing of the initial audio information is simplified.
It will be appreciated that the sound sensor receives sound waves and can display a vibration image of the sound, but it does not measure noise intensity. It incorporates a sound-sensitive capacitive electret microphone: sound waves vibrate the electret film in the microphone, causing a change in capacitance and producing a correspondingly small voltage, which is converted and transmitted to the prompting device.
S202, the prompting device recognizes the voice audio information in the initial audio information based on a voice recognition algorithm.
In some embodiments, after the prompting device obtains the initial audio information of the public transportation environment, the voice audio information in the initial audio information can be further identified based on a voice recognition algorithm.
For example, the prompting device classifies the various kinds of audio mixed together in the initial audio information based on a voice recognition algorithm and, using the frequency range of human speech, takes the audio that matches human speech as the human voice audio information. During this classification the prompting device sorts the audio only by frequency and performs no semantic analysis, which protects the passengers' privacy.
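The frequency-only classification described above can be sketched as follows. This is a minimal illustration, not the patent's algorithm: the band limits in `VOICE_BAND_HZ`, the `min_ratio` cutoff, and both function names are assumptions, since the text gives no numeric frequency range.

```python
import numpy as np

# Assumed band for human speech; the source text gives no numbers.
VOICE_BAND_HZ = (85.0, 3400.0)

def voice_band_ratio(frame, sample_rate, band=VOICE_BAND_HZ):
    """Return the fraction of spectral energy that falls inside `band`."""
    frame = np.asarray(frame, dtype=float)
    spectrum = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = spectrum.sum()
    if total == 0.0:
        return 0.0  # a silent frame carries no voice energy
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(spectrum[in_band].sum() / total)

def is_voice_like(frame, sample_rate, min_ratio=0.5):
    """Classify a frame by frequency content only; no words are decoded."""
    return voice_band_ratio(frame, sample_rate) >= min_ratio
```

Because the decision uses only where the energy sits in the spectrum, nothing in this sketch inspects what is being said, which is the privacy property the text relies on.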
In some embodiments, the human voice audio information is audio information with a volume greater than a second preset threshold after being processed based on a level filtering algorithm.
It should be understood that the prompting device may, before identifying the human voice audio information in the initial audio information, use a level filtering algorithm to filter out the audio in the initial audio information whose volume is not greater than a second preset threshold, and then identify the human voice audio information in the filtered initial audio information. Alternatively, the prompting device may first identify the human voice audio information in the initial audio information and then use the level filtering algorithm to filter out the audio whose volume is not greater than the second preset threshold. Either way, the prompting device processes only the audio in the acquired initial audio information whose volume exceeds the second preset threshold, so that it prompts only for loud noise in the public transportation environment. The level filtering algorithm is a conversion formula between the level of the audio's electrical signal and decibels: the prompting device converts the level of the initial audio information or of the human voice audio information into the corresponding decibel value, compares it with the second preset threshold to find the levels whose decibel value is not greater than the threshold, and filters out the audio corresponding to those levels. The second preset threshold is a volume threshold set by the administrator according to actual conditions, for example 40 dB or 50 dB, which is not limited in the embodiments of the present application.
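A level filter of this kind might look like the sketch below, which converts each frame's RMS level to decibels and silences frames that do not exceed the threshold. The function names, the frame-based processing, and the `reference` and `calibration_offset_db` parameters are assumptions for illustration; the text does not give the conversion formula, and mapping a digital level to a sound pressure level requires a calibrated sensor.

```python
import math

def frame_level_db(frame, reference=1.0, calibration_offset_db=0.0):
    """Convert a frame's RMS amplitude to decibels.

    `reference` and `calibration_offset_db` are hypothetical parameters:
    real deployments would obtain them by calibrating the sound sensor.
    """
    if not frame:
        return float("-inf")
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    if rms == 0.0:
        return float("-inf")
    return 20.0 * math.log10(rms / reference) + calibration_offset_db

def level_filter(samples, frame_size, threshold_db, **level_kwargs):
    """Keep frames louder than `threshold_db`; replace the rest with silence."""
    out = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        if frame_level_db(frame, **level_kwargs) > threshold_db:
            out.extend(frame)
        else:
            out.extend([0.0] * len(frame))
    return out
```

Frames at exactly the threshold are dropped, matching the text's wording that audio "not greater than" the second preset threshold is filtered out.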
S203, the prompting device filters relevant audio information of the prompt in the voice audio information to obtain target audio information.
The prompt includes at least one of: a stop prompt, a seat-offering prompt, a vehicle departure prompt, an in-vehicle safety prompt, and prompts commonly used by drivers.
In some embodiments, after the prompting device identifies the voice audio information in the initial audio information based on the voice recognition algorithm, relevant audio information of a prompt in the voice audio information can be filtered to obtain the target audio information.
In one possible implementation manner, the prompting device converts the voice audio information and the related audio information of the prompt into a spectrogram, and filters the related audio information of the prompt in the voice audio information based on the spectrogram to obtain the target audio information. Specifically, as shown in fig. 3, S203 may be implemented as follows S301-S305:
S301, the prompting device acquires relevant audio information of a prompt in the public transportation environment.
In some embodiments, the database of the broadcasting system of public transportation stores the related audio information of the public transportation prompt, and the prompting device may acquire the related audio information of the prompt from the database of the broadcasting system.
For example, because each public transportation vehicle has a different running time and route, the relevant audio information of the prompts stored in the database of the broadcasting system may differ. The prompting device may store the relevant audio information it acquires from the broadcasting system in its own database and update that database with each acquisition, building up a database of prompt audio from which it can then retrieve the relevant audio information directly.
S302, the prompting device converts the related audio information into a first spectrogram.
In some embodiments, the prompting device converts the acquired related audio information of the prompt into the first spectrogram based on a fourier transform algorithm.
For example, the prompting device may compute a spectrogram from the electrical signal of the relevant audio information based on a Fourier transform algorithm. The algorithm divides the audio into time blocks (usually overlapping) and computes the spectral amplitude of each block by Fourier transform. Each block corresponds to one vertical line in the spectrogram, giving the amplitude at each frequency at a specific point in time (the midpoint of the block). The measured amplitudes and frequencies are then arranged along the time axis to form a two-dimensional image or a three-dimensional surface.
It should be understood that, in the spectrogram, the horizontal axis is time (in seconds) and the vertical axis is frequency (in hertz), which represents the variation of the amplitude of the signal at different frequencies over time, and the different colors represent the amplitude of each frequency.
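The block-wise Fourier transform described above is commonly known as a short-time Fourier transform. A minimal sketch follows; the function name, the Hann window, and the default `frame_size` and `hop` values are assumptions, not taken from the source.

```python
import numpy as np

def spectrogram(samples, sample_rate, frame_size=256, hop=128):
    """Return (times, freqs, magnitudes) for overlapping, windowed blocks.

    magnitudes[f, t] is the spectral amplitude at freqs[f] for the block
    centred at times[t]; each column is one vertical line of the image.
    """
    samples = np.asarray(samples, dtype=float)
    if len(samples) < frame_size:
        raise ValueError("need at least one full block of samples")
    window = np.hanning(frame_size)
    n_frames = 1 + (len(samples) - frame_size) // hop
    columns = []
    for i in range(n_frames):
        block = samples[i * hop : i * hop + frame_size] * window
        columns.append(np.abs(np.fft.rfft(block)))
    magnitudes = np.array(columns).T  # rows: frequency, columns: time
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_size / 2) / sample_rate
    return times, freqs, magnitudes
```

With `hop` smaller than `frame_size` the blocks overlap, and `times` holds the midpoint of each block, as in the description. Rendering `magnitudes` with a color map gives the image in which color encodes amplitude.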
S303, the prompting device converts the voice audio information into a second spectrogram.
In some embodiments, the prompting device converts the human voice audio information to a second spectrogram based on a fourier transform algorithm.
S304, the prompting device performs similarity detection on the first spectrogram and the second spectrogram based on an image processing technique, and filters out of the second spectrogram the part that is the same as the first spectrogram to obtain a third spectrogram.
In some embodiments, after the prompting device obtains the first spectrogram of the related audio information and the second spectrogram of the human voice audio information, similarity detection may be performed on the first spectrogram and the second spectrogram based on an image processing technology, and a portion identical to the first spectrogram is filtered out from the second spectrogram to obtain a third spectrogram.
The prompting device extracts first feature points from the first spectrogram and second feature points from the second spectrogram based on an image processing technology, matches the first feature points against the second feature points to determine repeated feature points, and determines the overlapping area of the first spectrogram and the second spectrogram based on a homography matrix algorithm and the repeated feature points. The prompting device then deletes the overlapping area from the second spectrogram based on the image processing technology to obtain a third spectrogram.
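As a rough illustration of locating and removing the overlapping area, the following sketch substitutes a much simpler technique for the feature-point and homography matching described above: it slides the first spectrogram along the time axis of the second (a translation-only alignment), picks the offset with the smallest squared difference, and subtracts the matched magnitudes. The function names, the tolerance parameter, and the toy spectrograms are hypothetical.

```python
def column_distance(a, b):
    """Squared difference between two spectrogram columns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def find_overlap(prompt_spec, voice_spec):
    """Return (offset, distance) of the best time alignment, or None if it cannot fit."""
    width = len(prompt_spec)
    best = None
    for offset in range(len(voice_spec) - width + 1):
        d = sum(column_distance(prompt_spec[i], voice_spec[offset + i])
                for i in range(width))
        if best is None or d < best[1]:
            best = (offset, d)
    return best


def remove_overlap(prompt_spec, voice_spec, tolerance=1e-6):
    """Subtract the prompt magnitudes from the matched columns, clamping at zero."""
    third = [list(column) for column in voice_spec]
    match = find_overlap(prompt_spec, voice_spec)
    if match is not None and match[1] <= tolerance:
        offset = match[0]
        for i, prompt_column in enumerate(prompt_spec):
            third[offset + i] = [max(0.0, v - p)
                                 for v, p in zip(third[offset + i], prompt_column)]
    return third


# Toy spectrograms: each inner list is one time column of magnitudes.
first = [[0, 5, 0], [0, 3, 0]]
second = [[1, 0, 0], [0, 0, 2], [0, 5, 0], [0, 3, 0], [4, 0, 0]]
match = find_overlap(first, second)
third = remove_overlap(first, second)
```

A translation-only search cannot handle scaling or distortion between the two images, which is presumably why the description relies on feature points and a homography matrix; the sketch only shows the "find the overlap, then delete it" structure.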
S305, the prompting device converts the third spectrogram into target audio information.
In some embodiments, after the prompting device obtains the third spectrogram, the third spectrogram may be converted into the target audio information based on an inverse Fourier transform algorithm.
The prompting device converts the third spectrogram into a time-domain waveform diagram based on an inverse Fourier transform algorithm, then linearly transforms the coordinates of the waveform pixel points in the time-domain waveform diagram in time order to obtain energy points, and obtains the target audio information from the energy points.
In another possible implementation, the prompting device determines a target frequency range of the related audio information of the prompt and uses a band-stop filter to filter that audio information out of the voice audio information. Specifically, as shown in fig. 4, S203 may be implemented as the following S401 to S403:
S401, the prompting device acquires relevant audio information of a prompt in a public transportation environment.
For the description of S401, refer to the description of S301; details are not repeated here.
S402, the prompting device determines a target frequency range of the related audio information.
In some embodiments, after the prompting device obtains the relevant audio information of the prompt in the public transportation environment, a target frequency range of the relevant audio information may be determined.
Illustratively, the prompting device determines the peak value of the electrical signal of the audio information at each time point in the related audio information based on a fast Fourier transform algorithm, and determines the frequency of the audio information at each time point based on a frequency calculation formula, thereby determining the target frequency range.
S403, the prompting device uses a band-stop filter to filter out the audio information in the target frequency range from the voice audio information to obtain target audio information.
In some embodiments, after the prompting device determines the target frequency range of the related audio information, a band-stop filter may be used to filter out the audio information in the target frequency range from the voice audio information to obtain the target audio information.
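One way to read the step above is sketched below. The source does not give the frequency calculation formula; the sketch assumes the standard relation f = k * fs / N between DFT bin index k, sample rate fs, and block length N. The function names, the block size, and the two-tone test signal are hypothetical, and a naive DFT stands in for a fast Fourier transform.

```python
import cmath
import math


def dft_magnitudes(block):
    """Magnitudes of the non-negative frequency bins of a naive DFT."""
    n_samples = len(block)
    return [abs(sum(block[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                    for n in range(n_samples)))
            for k in range(n_samples // 2 + 1)]


def target_frequency_range(samples, sample_rate, block_size=64):
    """Return (lowest, highest) peak frequency in hertz over all blocks."""
    peaks = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        mags = dft_magnitudes(samples[start:start + block_size])
        # Skip bin 0 (DC) and take the bin with the largest magnitude.
        peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
        # Assumed frequency formula: f = k * fs / N.
        peaks.append(peak_bin * sample_rate / block_size)
    return min(peaks), max(peaks)


# Hypothetical prompt audio: 500 Hz followed by 1500 Hz, sampled at 8000 Hz.
sample_rate = 8000
prompt_audio = ([math.sin(2 * math.pi * 500 * n / sample_rate) for n in range(128)]
                + [math.sin(2 * math.pi * 1500 * n / sample_rate) for n in range(128)])
low_hz, high_hz = target_frequency_range(prompt_audio, sample_rate)
```

The sketch assumes the input holds at least one full block; a shorter input would leave the list of peaks empty.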
It should be understood that a band-stop filter is a filter that suppresses frequency components within a certain stop band and passes frequency components outside the stop band. In an actual circuit, a low-pass filter and a high-pass filter are usually connected in parallel to form a band-stop filter circuit.
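The behaviour of a band-stop filter can be demonstrated in software with the sketch below. It is an idealized frequency-domain version (every DFT bin inside the stop band is set to zero, which is equivalent to summing an ideal low-pass path and an ideal high-pass path), not a model of the analog circuit mentioned above. The function names, the stop band of 1400 to 1600 Hz, and the test tones are assumptions.

```python
import cmath
import math


def dft(x):
    """Naive forward DFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def band_stop(samples, sample_rate, low_hz, high_hz):
    """Zero every frequency bin whose frequency lies in [low_hz, high_hz]."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies, so fold them back.
        freq = min(k, n - k) * sample_rate / n
        if low_hz <= freq <= high_hz:
            spectrum[k] = 0
    return idft(spectrum)


# Hypothetical mix: passenger speech stand-in (500 Hz) plus prompt stand-in (1500 Hz).
sample_rate = 8000
speech = [math.sin(2 * math.pi * 500 * n / sample_rate) for n in range(128)]
prompt = [math.sin(2 * math.pi * 1500 * n / sample_rate) for n in range(128)]
mixed = [s + p for s, p in zip(speech, prompt)]
filtered = band_stop(mixed, sample_rate, 1400, 1600)
max_error = max(abs(f - s) for f, s in zip(filtered, speech))
```

After filtering, the 1500 Hz component is removed and the 500 Hz component is left essentially unchanged. A practical filter would have a gradual transition band rather than this brick-wall response.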
S204, when the volume of the target audio information is greater than a first preset threshold, the prompting device generates prompt information for prohibiting loud noise.
In some embodiments, after the prompting device obtains the target audio information, the prompting device may generate the prompt information for prohibiting loud noise when the volume of the target audio information is greater than the first preset threshold. The first preset threshold is a volume threshold set by a manager according to actual conditions, which is not limited in the embodiments of the present application. In addition, the first preset threshold is greater than the second preset threshold.
It should be understood that, after the prompting device generates the prompt information for prohibiting loud noise, the prompt information may also be sent to the broadcasting system, which prompts passengers by voice playback or text display. When the prompt information is played by voice in the public transportation environment, the public transportation broadcasting system also needs to play other prompts, so it plays the prompt information according to a playing rule. The playing priority of the playing rule, from high to low, is: stop prompt, seat offering prompt, vehicle departure prompt, in-vehicle safety prompt, and the prompt information. When the prompt information is displayed as scrolling text on the electronic display screen of the public transportation environment, the electronic display screen of the public transportation broadcasting system also needs to display station information and safety prompts, so the broadcasting system displays the prompt information according to a display rule. The display priority of the display rule, from high to low, is: station information, safety prompt, and the prompt information.
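The threshold comparison can be sketched as follows. The source does not say how volume is measured; the sketch assumes the RMS level in decibels relative to full scale (dBFS) of samples normalized to the range -1 to 1. The function names and the threshold of -20 dBFS are hypothetical.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS; returns -inf for silence or an empty input."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def should_prompt(target_samples, first_threshold_db):
    """True when the target audio is louder than the first preset threshold."""
    return rms_dbfs(target_samples) > first_threshold_db


# Hypothetical inputs: a loud and a quiet 1000 Hz tone sampled at 8000 Hz.
FIRST_THRESHOLD_DB = -20.0  # assumed value; the source leaves it to the manager
loud = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(800)]
quiet = [0.01 * s for s in loud]
loud_triggers = should_prompt(loud, FIRST_THRESHOLD_DB)
quiet_triggers = should_prompt(quiet, FIRST_THRESHOLD_DB)
```

A deployed device would more likely compare a calibrated sound pressure level, which depends on the microphone and is outside the scope of this sketch.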
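The playing rule above amounts to a priority queue. A minimal sketch, assuming hypothetical message kind names and a hypothetical `BroadcastQueue` class, could look like this; messages of the same kind are played in arrival order.

```python
import heapq

# Playing priority from high (0) to low (4), following the order in the text.
# The key names are illustrative labels, not identifiers from the source.
PLAY_PRIORITY = {
    "stop": 0,
    "seat_offering": 1,
    "vehicle_departure": 2,
    "in_vehicle_safety": 3,
    "no_loud_noise": 4,
}


class BroadcastQueue:
    """Queue that always hands out the highest-priority pending message."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker that preserves arrival order

    def submit(self, kind, text):
        heapq.heappush(self._heap, (PLAY_PRIORITY[kind], self._counter, kind, text))
        self._counter += 1

    def next_message(self):
        if not self._heap:
            return None
        _, _, kind, text = heapq.heappop(self._heap)
        return kind, text


queue = BroadcastQueue()
queue.submit("no_loud_noise", "Please keep your voice down.")
queue.submit("stop", "Next stop: Central Station.")
queue.submit("in_vehicle_safety", "Please hold the handrail.")
played = [queue.next_message()[0] for _ in range(3)]
```

The display rule could reuse the same structure with a second priority table for station information, safety prompt, and the prompt information.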
The technical solution provided by the embodiments of the present application has at least the following beneficial effects. In the prompting method provided by the embodiments of the present application, the prompting device acquires initial audio information in the public transportation environment, recognizes the voice audio information in the initial audio information based on a voice recognition algorithm, filters the related audio information of the prompt out of the voice audio information to obtain the target audio information, and generates prompt information for prohibiting loud noise when the volume of the target audio information is greater than the first preset threshold. The prompting device thus obtains the target audio information through simple recognition and filtering operations on the initial audio information and automatically generates the prompt information, realizing automatic prompting against loud noise in the public transportation environment. In addition, the embodiments of the present application do not involve a complex language model or acoustic model and, compared with the prior art, are simple to implement and broadly applicable. Moreover, the prompting device does not need to perform semantic analysis on the passengers' audio information; it only needs to identify whether high-volume passenger audio information exists in the public transportation environment and then generate the prompt information for prohibiting loud noise, so the personal privacy of passengers is protected.
Furthermore, the prompting device can also identify the voice audio information in the initial audio information through the spectrogram or the band-stop filter, so that only the voice audio information is processed. This avoids a prompt prohibiting loud noise being triggered by other noise in the public transportation environment and improves prompting accuracy.
The following describes a prompting method according to an embodiment of the present application, and a specific implementation process of the method is shown in fig. 5.
The prompting device acquires initial audio information in the public transportation environment through a microphone or another sound pickup sensor, filters low-decibel sound out of the initial audio information based on a level filtering algorithm, and recognizes the voice audio information in the filtered initial audio information based on a voice recognition algorithm. The prompting device then performs voice preprocessing on the voice audio information: it extracts the voice features of the public transportation prompt voice information (such as stop prompts and seat offering prompts) and of the voice audio information, and filters the prompt voice information out of the voice audio information based on these voice features to obtain the target audio information. Finally, the prompting device generates prompt information for prohibiting loud noise when the volume of the target audio information is greater than a preset threshold.
The prompting device can also reuse the broadcasting system, power supply module, intelligent scheduling module and communication system of public transportation. The prompting device can send the generated prompt information for prohibiting loud noise to the intelligent scheduling module of public transportation, and the intelligent scheduling module can play or display that prompt information through the broadcasting system based on the play control rules for the various prompt sounds.
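The first step of this flow, discarding low-decibel sound before recognition, might be sketched as a frame-wise level gate. The source does not specify the filtering algorithm; the frame size, the dBFS level measure, the threshold of -40 dBFS, and the function names below are assumptions.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS; returns -inf for silence or an empty input."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def level_filter(samples, frame_size, second_threshold_db):
    """Keep only the frames whose level exceeds the second preset threshold."""
    kept = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        if rms_dbfs(frame) > second_threshold_db:
            kept.extend(frame)
    return kept


# Hypothetical input: one quiet frame (about -60 dBFS) then one loud frame (about -6 dBFS).
initial_audio = [0.001] * 80 + [0.5] * 80
kept = level_filter(initial_audio, frame_size=80, second_threshold_db=-40.0)
```

Only the loud frame survives, so later recognition and filtering steps never see the low-level background sound.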
In an exemplary embodiment, the application further provides a prompting device. The prompting device may include one or more functional modules for implementing the prompting method of the above method embodiments.
For example, fig. 6 is a schematic diagram of a prompting device according to an embodiment of the present application. As shown in fig. 6, the prompting device includes: an acquisition module 601, an identification module 602, a filtering module 603 and a generation module 604.
The acquisition module 601 is configured to acquire initial audio information in a public transportation environment. The identification module 602 is configured to recognize the voice audio information in the initial audio information based on a voice recognition algorithm. The filtering module 603 is configured to filter relevant audio information of a prompt in the voice audio information to obtain target audio information. The generation module 604 is configured to generate prompt information for prohibiting loud noise when the volume of the target audio information is greater than a first preset threshold.
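The division into four modules can be pictured as a small pipeline in which each module is a pluggable function. This is only a structural sketch: the class name `PromptingDevice`, the field names, the stub functions, and the prompt text are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Audio = List[float]


@dataclass
class PromptingDevice:
    acquire: Callable[[], Audio]              # role of acquisition module 601
    identify_voice: Callable[[Audio], Audio]  # role of identification module 602
    filter_prompts: Callable[[Audio], Audio]  # role of filtering module 603
    volume: Callable[[Audio], float]          # volume measure used by module 604
    first_threshold: float

    def run_once(self) -> Optional[str]:
        """Run the four stages in order; return prompt text or None."""
        initial = self.acquire()
        voice = self.identify_voice(initial)
        target = self.filter_prompts(voice)
        if target and self.volume(target) > self.first_threshold:
            return "Please do not make loud noise."
        return None


def make_device(samples):
    """Build a device from trivial stand-in stages, for illustration only."""
    return PromptingDevice(
        acquire=lambda: samples,
        identify_voice=lambda audio: [s for s in audio if s != 0.0],
        filter_prompts=lambda audio: [s for s in audio if s != 0.1],
        volume=lambda audio: max(abs(s) for s in audio),
        first_threshold=0.5,
    )


loud_result = make_device([0.0, 0.9, 0.1]).run_once()
quiet_result = make_device([0.0, 0.2]).run_once()
```

Keeping each stage behind a plain function makes it possible to swap the spectrogram-based filter for the band-stop filter without touching the other modules.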
In some embodiments, the filtering module 603 is specifically configured to obtain relevant audio information of a prompt in a public transportation environment, convert the relevant audio information into a first spectrogram, convert the voice audio information into a second spectrogram, perform similarity detection on the first spectrogram and the second spectrogram based on an image processing technology, filter out a part identical to the first spectrogram from the second spectrogram, obtain a third spectrogram, and convert the third spectrogram into the target audio information.
In other embodiments, the filtering module 603 is specifically configured to obtain relevant audio information of a prompt in a public transportation environment, determine a target frequency range of the relevant audio information, and filter audio information in the target frequency range of the voice audio information by using a band-stop filter to obtain the target audio information.
In still other embodiments, the human voice audio information is audio information with a volume greater than a second preset threshold after being processed based on a level filtering algorithm.
In still other embodiments, the acquisition module 601 is specifically configured to acquire initial audio information in a public transportation environment through a low sensitivity sound sensor.
In still other embodiments, the prompt comprises at least one of: a stop prompt, a seat offering prompt, a vehicle departure prompt, an in-vehicle safety prompt, and a common driver prompt.
In an exemplary embodiment, the embodiment of the application further provides an electronic device, which may be the prompting device in the above method embodiment. Fig. 7 is a schematic diagram of an electronic device according to an embodiment of the present application. As shown in fig. 7, the electronic device may include: a processor 701 and a memory 702; memory 702 stores instructions executable by processor 701; the processor 701 is configured to execute instructions that, when executed, cause an electronic device or network device or manager to implement a method as described in the foregoing method embodiments.
In an exemplary embodiment, embodiments of the application also provide a computer-readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a computer, cause the computer to implement the method as described in the previous embodiments. The computer-readable storage medium may be a non-transitory computer-readable storage medium, for example, a read-only memory (ROM), a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
In an exemplary embodiment, the present application also provides a computer program product, which when run on a computer causes the computer to perform the above-mentioned related method steps to implement the prompting method in the above-mentioned embodiments.
The present application is not limited to the above embodiments, and any changes or substitutions within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims (14)

1. A prompting method, characterized in that the method is applied to a broadcasting system for playing a prompt in a public transportation environment; the method comprises the following steps:
acquiring initial audio information in the public transportation environment;
identifying the voice audio information in the initial audio information based on a voice identification algorithm;
filtering relevant audio information of the prompt in the voice audio information to obtain target audio information;
and generating prompt information for prohibiting loud noise when the volume of the target audio information is greater than a first preset threshold.
2. The method according to claim 1, wherein said filtering the relevant audio information of the prompt in the voice audio information to obtain target audio information includes:
acquiring relevant audio information of the prompt in the public transportation environment;
converting the related audio information into a first spectrogram;
converting the voice audio information into a second spectrogram;
performing similarity detection on the first spectrogram and the second spectrogram based on an image processing technology, and filtering out a part which is the same as the first spectrogram from the second spectrogram to obtain a third spectrogram;
and converting the third spectrogram into the target audio information.
3. The method according to claim 1, wherein said filtering the relevant audio information of the prompt in the voice audio information to obtain target audio information includes:
acquiring relevant audio information of the prompt in the public transportation environment;
determining a target frequency range of the related audio information;
and filtering out the audio information located in the target frequency range from the voice audio information by using a band-stop filter to obtain the target audio information.
4. A method according to any one of claims 1-3, wherein the human voice audio information is audio information with a volume greater than a second preset threshold value after processing based on a level filtering algorithm.
5. The method of claim 1, wherein the obtaining initial audio information in the public transportation environment comprises:
the initial audio information in the public transportation environment is acquired by a low-sensitivity sound sensor.
6. A method according to any one of claims 1-3, wherein the prompt comprises at least one of: a stop prompt, a seat offering prompt, a vehicle departure prompt, an in-vehicle safety prompt, and a common driver prompt.
7. A prompting device, which is characterized by being applied to a broadcasting system for playing a prompting message in a public transportation environment; the device comprises: the device comprises an acquisition module, an identification module, a filtering module and a generation module;
the acquisition module is used for acquiring initial audio information in the public transportation environment;
the identification module is used for recognizing the voice audio information in the initial audio information based on a voice recognition algorithm;
the filtering module is used for filtering relevant audio information of the prompt in the voice audio information to obtain target audio information;
the generating module is used for generating prompt information for prompting to prohibit loud noise when the volume of the target audio information is larger than a first preset threshold value.
8. The apparatus of claim 7, wherein the filtering module is specifically configured to: obtain relevant audio information of the prompt in the public transportation environment; convert the related audio information into a first spectrogram; convert the voice audio information into a second spectrogram; perform similarity detection on the first spectrogram and the second spectrogram based on an image processing technology, and filter out a part which is the same as the first spectrogram from the second spectrogram to obtain a third spectrogram; and convert the third spectrogram into the target audio information.
9. The apparatus of claim 7, wherein the filtering module is specifically configured to: obtain relevant audio information of the prompt in the public transportation environment; determine a target frequency range of the related audio information; and filter out the audio information located in the target frequency range from the voice audio information by using a band-stop filter to obtain the target audio information.
10. The apparatus according to any one of claims 7-9, wherein the human voice audio information is audio information with a volume greater than a second preset threshold value after processing based on a level filtering algorithm.
11. The apparatus of claim 7, wherein the acquisition module is configured to acquire the initial audio information in the public transportation environment via a low sensitivity sound sensor.
12. The apparatus of any of claims 7-9, wherein the prompt comprises at least one of: a stop prompt, a seat offering prompt, a vehicle departure prompt, an in-vehicle safety prompt, and a common driver prompt.
13. An electronic device, the electronic device comprising: a processor and a memory;
the memory stores instructions executable by the processor;
the processor is configured to, when executing the instructions, cause the electronic device to implement the method of any one of claims 1-6.
14. A computer-readable storage medium, the computer-readable storage medium comprising: computer software instructions;
the computer software instructions, when run on an electronic device, cause the electronic device to implement the method of any one of claims 1-6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310996686.3A CN116978361A (en) 2023-08-08 2023-08-08 Prompting method, prompting device, prompting equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116978361A true CN116978361A (en) 2023-10-31

Family

ID=88483053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310996686.3A Pending CN116978361A (en) 2023-08-08 2023-08-08 Prompting method, prompting device, prompting equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116978361A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination