CN110660201A - Arrival reminding method, device, terminal and storage medium - Google Patents

Publication number
CN110660201A
Authority
CN
China
Prior art keywords
target
recognition
recognition model
sound
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910897452.7A
Other languages
Chinese (zh)
Other versions
CN110660201B (en
Inventor
刘文龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jinsheng Communication Technology Co ltd
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Shanghai Jinsheng Communication Technology Co ltd
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jinsheng Communication Technology Co ltd, Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910897452.7A
Publication of CN110660201A
Application granted
Publication of CN110660201B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00 Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/18 Status alarms
    • G08B21/24 Reminder alarms, e.g. anti-loss alarms
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Emergency Management (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Burglar Alarm Systems (AREA)

Abstract

The embodiments of the present application disclose an arrival reminder method, device, terminal, and storage medium, belonging to the field of artificial intelligence. The method comprises: collecting ambient sound through a microphone while in a vehicle; recognizing the ambient sound; when the ambient sound is recognized as containing a target alarm bell, adding one to the number of stations traveled; and when the number of stations traveled reaches the target number of stations, issuing an arrival reminder. With this method, ambient sound is collected in real time and checked for the target alarm bell; the number of stations traveled is updated whenever the bell is recognized, and an arrival reminder is issued once that number reaches the target number of stations. Because the alarm bell exists to warn passengers, its sound features are distinctive and easy to recognize, so basing the arrival reminder on the alarm bell in the ambient sound can improve the accuracy and effectiveness of the reminder.

Description

Arrival reminder method, device, terminal and storage medium

Technical field

The embodiments of the present application relate to the field of artificial intelligence, and in particular to an arrival reminder method, device, terminal, and storage medium.

Background

When people travel by public transportation such as the subway, they must constantly check whether the current stop is their target station. An arrival reminder function reminds passengers to get off in time when they reach the target station.

In the related art, a terminal typically uses speech recognition to obtain the current station information from the arrival announcement broadcast on the subway and judges whether the current station is the passenger's target station. If it is, the terminal issues an arrival reminder to the passenger.

However, when station information is obtained in this way, passengers' voices and the noise of subway operation strongly affect the speech recognition result, which easily leads to delayed or inaccurate reminders.

Summary of the invention

Embodiments of the present application provide an arrival reminder method, device, terminal, and storage medium. The technical solution is as follows:

In one aspect, an embodiment of the present application provides an arrival reminder method, the method comprising:

when in a vehicle, collecting ambient sound through a microphone;

recognizing the ambient sound;

when the ambient sound is recognized as containing a target alarm bell, adding one to the number of stations traveled, the target alarm bell being a door-opening alarm bell or a door-closing alarm bell;

when the number of stations traveled reaches the target number of stations, issuing an arrival reminder, the target number of stations being the number of stations between the starting station and the target station, and the target station being a transfer station or a destination station.

In another aspect, an embodiment of the present application provides an arrival reminder device, the device comprising:

a collection module, configured to collect ambient sound through a microphone when in a vehicle;

a recognition module, configured to recognize the ambient sound;

a counting module, configured to add one to the number of stations traveled when the ambient sound is recognized as containing a target alarm bell, the target alarm bell being a door-opening alarm bell or a door-closing alarm bell;

a reminder module, configured to issue an arrival reminder when the number of stations traveled reaches the target number of stations, the target number of stations being the number of stations between the starting station and the target station, and the target station being a transfer station or a destination station.

In another aspect, an embodiment of the present application provides a terminal comprising a processor and a memory; the memory stores at least one instruction, and the at least one instruction is executed by the processor to implement the arrival reminder method described in the above aspect.

In another aspect, an embodiment of the present application provides a computer-readable storage medium storing at least one instruction, the at least one instruction being executed by a processor to implement the arrival reminder method described in the above aspect.

The beneficial effects of the technical solutions provided by the embodiments of the present application include at least the following:

In the embodiments of the present application, ambient sound is collected in real time and checked for the target alarm bell. The number of stations traveled is updated whenever the target alarm bell is recognized, and an arrival reminder is issued once the number of stations traveled reaches the target number of stations. Because the alarm bell exists to warn passengers, its sound features are distinctive and easy to recognize, so basing the arrival reminder on the alarm bell in the ambient sound can improve the accuracy and effectiveness of the reminder.

Description of drawings

FIG. 1 is a flowchart of an arrival reminder method according to an exemplary embodiment;

FIG. 2 is a flowchart of an arrival reminder method according to another exemplary embodiment;

FIG. 3 is a flowchart of an arrival reminder method according to another exemplary embodiment;

FIG. 4 is a flowchart of audio data preprocessing according to an exemplary embodiment;

FIG. 5 is a flowchart of a sound recognition process according to another exemplary embodiment;

FIG. 6 is a flowchart of an arrival reminder method according to another exemplary embodiment;

FIG. 7 is a flowchart of sound recognition performed by a first-level sound recognition model according to an exemplary embodiment;

FIG. 8 is a spectrogram of an ambient sound according to an exemplary embodiment;

FIG. 9 is a framework diagram of a second-level sound recognition model according to an exemplary embodiment;

FIG. 10 is a structural block diagram of an arrival reminder device according to an exemplary embodiment;

FIG. 11 is a structural block diagram of a terminal according to an exemplary embodiment.

The above drawings show specific embodiments of the present disclosure, which are described in more detail below. The drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but to explain the concepts of the present disclosure to those skilled in the art by reference to specific embodiments.

Detailed description

In order to make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

A "module" as mentioned herein generally refers to a program or instructions stored in memory that can implement certain functions. A "unit" as mentioned herein generally refers to a functional structure divided according to logic; a "unit" may be implemented purely in hardware or by a combination of software and hardware.

"A plurality" as mentioned herein means two or more. "And/or" describes the relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the objects before and after it.

The arrival reminder method provided by the embodiments of the present application is used in a terminal with audio collection and processing functions. The terminal may be a smartphone, a tablet computer, an e-book reader, a personal portable computer, or the like. In one possible implementation, the arrival reminder method is implemented as an application, or as part of an application, installed on the terminal. When the user rides a vehicle, the application can be started manually (or can start automatically) and then reminds the user of the arrival.

In the related art, speech recognition is typically used to determine the name of the current station from the station announcement broadcast when the vehicle arrives, and the user is reminded on reaching the target station. However, ambient sounds such as the noise of the moving vehicle and passengers' voices interfere with speech recognition and easily cause recognition errors. In addition, speech recognition models are difficult to run on the terminal and usually rely on the cloud.

The related art also includes using an accelerometer to detect whether the vehicle is accelerating or decelerating and thereby judge whether it is entering a station. However, the acceleration direction recorded by the terminal's accelerometer depends on how the user holds the terminal, and the user's movement inside the vehicle also affects the sensor readings. Moreover, vehicles sometimes stop temporarily between two stations, so it is difficult to determine the vehicle's position accurately with an accelerometer.

To solve the above problems, an embodiment of the present application provides an arrival reminder method whose flow is shown in FIG. 1. Before the terminal uses the arrival reminder function for the first time, it performs step 101 and stores the vehicle route map. When the terminal enables the arrival reminder function, it first performs step 102 and determines the travel route. After the user enters the vehicle, the terminal performs step 103 and acquires ambient sound in real time through the microphone. In step 104, the terminal recognizes whether the ambient sound contains the target alarm bell. If it does not, the terminal continues with the next segment of ambient sound. If it does, the terminal performs step 105 and adds one to the number of stations traveled. In step 106, the terminal judges from the number of stations traveled whether the current station is the destination station. If it is, the terminal performs step 107 and sends an arrival reminder. If it is not, the terminal performs step 108 and judges whether the current station is a transfer station; if so, it performs step 107 and sends an arrival reminder, and otherwise it continues to recognize the next segment of ambient sound.
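The loop formed by steps 103 to 108 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `run_arrival_reminder`, the `contains_target_bell` callback (standing in for the sound recognition of step 104), and the representation of transfer and destination stations as station counts are all assumptions introduced here.

```python
from typing import Callable, Iterable, List


def run_arrival_reminder(
    segments: Iterable[bytes],
    contains_target_bell: Callable[[bytes], bool],
    transfer_counts: List[int],
    destination_count: int,
) -> List[str]:
    """Count stations from bell detections and collect reminder messages.

    All names are hypothetical; `contains_target_bell` stands in for the
    recognition step (step 104) described in the text.
    """
    reminders: List[str] = []
    stations_traveled = 0
    for segment in segments:                         # step 103: next ambient sound segment
        if not contains_target_bell(segment):        # step 104: no bell, keep listening
            continue
        stations_traveled += 1                       # step 105: add one station
        if stations_traveled == destination_count:   # step 106: destination reached?
            reminders.append(f"destination reached after {stations_traveled} stations")  # step 107
            break
        if stations_traveled in transfer_counts:     # step 108: transfer station?
            reminders.append(f"transfer station reached after {stations_traveled} stations")  # step 107
    return reminders
```

Passing the detector as a callback keeps the counting logic independent of how the bell is actually recognized.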

Compared with the arrival reminder methods in the related art, the embodiments of the present application determine how many stations the vehicle has traveled by recognizing whether the current ambient sound contains the target alarm bell. The target alarm bell has distinctive features compared with other ambient sounds and is affected by fewer factors, so the recognition accuracy is high. Moreover, no complex speech recognition model is needed, which helps reduce the power consumption of the terminal.

Please refer to FIG. 2, which shows a flowchart of an arrival reminder method according to an embodiment of the present application. This embodiment is described using the example of the method applied to a terminal with audio collection and processing functions. The method includes:

Step 201: when in a vehicle, collect ambient sound through a microphone.

When in a vehicle, the terminal enables the arrival reminder function and collects ambient sound in real time through the microphone.

In one possible implementation, when the arrival reminder method is applied in a map navigation application, the terminal acquires the user's location information in real time and enables the arrival reminder function when it determines from that information that the user has entered a vehicle.

Optionally, when the user uses a payment application to swipe in for a ride, the terminal confirms that the user has entered the vehicle and enables the arrival reminder function.

Optionally, to reduce power consumption, the terminal may use a low-power microphone for real-time collection.

Step 202: recognize the ambient sound.

Optionally, the terminal converts the ambient sound collected in real time through the microphone into audio data, processes the audio data, and recognizes whether the processed audio data contains the audio data of the target alarm bell.

In one possible implementation, when the terminal obtains a city's vehicle route map, it also obtains the alarm bells of the different vehicles and saves their audio data locally. When the terminal cannot obtain the alarm bell for the current city or vehicle, the user needs to turn on the microphone during the first ride to record and save the alarm bell so that the terminal can learn it.

Step 203: when the ambient sound is recognized as containing the target alarm bell, add one to the number of stations traveled, the target alarm bell being a door-opening alarm bell or a door-closing alarm bell.

When the terminal recognizes that the current ambient sound contains the target alarm bell, the vehicle has arrived at a station, and the terminal adds one to the number of stations traveled. Because vehicles usually sound an alarm bell both when opening and when closing the doors, the terminal can be configured in advance to recognize only the door-opening bell or only the door-closing bell in order to avoid miscounting. The interval between the door-opening bell and the door-closing bell is usually short, so when the two bells are identical, two bell detections within a fixed time window are treated as a single door opening or a single door closing.
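The rule for identical opening and closing bells amounts to merging detections that fall within one time window. A minimal sketch follows, assuming detections arrive as timestamps in seconds; the function name `count_stations` and the 60-second default window are illustrative assumptions, since the text does not specify a window length.

```python
from typing import Iterable, Optional


def count_stations(bell_times: Iterable[float], window_s: float = 60.0) -> int:
    """Count station stops from bell detection timestamps (seconds).

    Detections within `window_s` of the first bell of a stop are merged,
    so an opening bell followed by an identical closing bell counts once.
    The 60 s default is an assumed value, not taken from the source.
    """
    count = 0
    window_start: Optional[float] = None
    for t in sorted(bell_times):
        if window_start is None or t - window_start > window_s:
            count += 1          # first bell of a new stop
            window_start = t    # later bells inside the window are ignored
    return count
```

For example, bells at 10 s and 35 s belong to one stop, while a bell at 200 s starts the next one.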

Step 204: when the number of stations traveled reaches the target number of stations, issue an arrival reminder, the target number of stations being the number of stations between the starting station and the target station, and the target station being a transfer station or a destination station.

After the terminal adds one, if the number of stations traveled has reached the target number of stations, the current station is the target station and the terminal reminds the user of the arrival. The target number of stations is the number of stations between the starting station and the target station, that is, the number of stations the vehicle must travel to get from the starting station to the target station. Target stations include transfer stations and destination stations.

Optionally, to prevent the user from missing the stop because too little time passes between the arrival reminder and the vehicle closing its doors and leaving for the next station, the terminal can be set to send an "arriving soon" message when the vehicle reaches the station before the target station, so that the user can prepare to get off in advance.

Optionally, the arrival reminder may take forms including but not limited to: voice reminder, vibration reminder, and on-screen reminder.

Regarding how the target number of stations is obtained, in one possible implementation the terminal loads and stores in advance the route map of the vehicles in the current city. The route map contains, for each line, the station information, transfer information, first and last departure times, maps of the area around each station, and so on. Before the terminal turns on the microphone to collect ambient sound, it first obtains the user's ride information, which includes the starting station, the target station, the map of the area around the station, and the first and last departure times, and determines the target number of stations from that information.
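As an illustration of deriving the target number of stations from a stored route map, the sketch below assumes a line is stored as an ordered list of station names. The `line_map` layout, the function name, and the station names are hypothetical; the text does not describe the storage format.

```python
from typing import Dict, List


def target_station_count(
    line_map: Dict[str, List[str]], line: str, start: str, target: str
) -> int:
    """Number of stations the vehicle travels from `start` to `target`.

    `line_map` maps a line name to its stations in travel order
    (an assumed layout). Raises KeyError or ValueError for unknown
    lines or stations.
    """
    stations = line_map[line]
    return abs(stations.index(target) - stations.index(start))


# Hypothetical example data.
example_map = {"Line 1": ["A", "B", "C", "D", "E"]}
stops_needed = target_station_count(example_map, "Line 1", "B", "E")
```

The result is the number of bell events the terminal has to count before issuing the reminder; a trip with a transfer would be computed as one count per leg.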

Optionally, the terminal may obtain the ride information from manual user input, for example the names of the starting station and the target station. The terminal selects a suitable route based on the ride information entered by the user and the vehicle route map. On arrival at the target station, the terminal sends the user an arrival reminder message together with a map of the area around the target station.

Optionally, the ride information entered by the user may be just the number of stations between the starting station and the target station. In the method of this embodiment, the terminal judges the current station from the alarm bell sounded when the vehicle opens or closes its doors, adding one to the number of stations traveled each time the target alarm bell is recognized, until the number of stations traveled equals the number of stations needed to get from the starting station to the target station. Therefore, a user who already has a definite route can enter only the station counts for that route. The terminal can prompt such a user to enter the number of stations between the starting station and the transfer station and the number of stations between the transfer station and the destination station.

Optionally, the terminal may predict the user's route from the user's ride history, treat routes whose ride count reaches a ride-count threshold as preferred routes, and prompt the user to choose among them.
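The history-based suggestion can be sketched with a simple frequency count. Representing a route as a `(start, target)` pair and the name `preferred_routes` are assumptions made for this example; the text only states that routes reaching a ride-count threshold are preferred.

```python
from collections import Counter
from typing import List, Tuple

Route = Tuple[str, str]  # (starting station, target station), an assumed representation


def preferred_routes(history: List[Route], ride_threshold: int) -> List[Route]:
    """Return routes ridden at least `ride_threshold` times, most frequent first."""
    counts = Counter(history)
    return [route for route, rides in counts.most_common() if rides >= ride_threshold]
```

The returned list could then be shown to the user as the options to choose from.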

In summary, in the embodiments of the present application, ambient sound is collected in real time and checked for the target alarm bell. The number of stations traveled is updated whenever the target alarm bell is recognized, and an arrival reminder is issued once the number of stations traveled reaches the target number of stations. Because the alarm bell exists to warn passengers, its sound features are distinctive and easy to recognize, so basing the arrival reminder on the alarm bell in the ambient sound can improve the accuracy and effectiveness of the reminder.

In one possible implementation, to improve recognition accuracy when checking whether the ambient sound contains the target alarm bell, the audio data corresponding to the ambient sound is first preprocessed and then input into a sound recognition model. Whether the current ambient sound contains the target alarm bell is then judged from the recognition result output by the model. An illustrative embodiment is described below.

Please refer to FIG. 3, which shows a flowchart of an arrival reminder method according to another embodiment of the present application. This embodiment is described using the example of the method applied to a terminal with audio collection and processing functions. The method includes:

Step 301: when in a vehicle, collect ambient sound through a microphone.

For the implementation of step 301, refer to step 201 above; details are not repeated here.

Step 302: divide the audio data corresponding to the ambient sound into frames to obtain the audio data corresponding to different audio frames.

Because the sound recognition model cannot recognize raw audio data directly, the audio data must be preprocessed into numerical features that the model can recognize. The terminal microphone collects ambient sound in real time, so the audio data is not stationary as a whole, but locally it can be treated as stationary. Since the sound recognition model can only recognize stationary data, the terminal first divides the audio data into frames to obtain the audio data corresponding to different audio frames.

In one possible implementation, the audio data preprocessing is as shown in FIG. 4. The audio data first passes through the pre-emphasis module 401. Pre-emphasis uses a high-pass filter, which passes only signal components above a certain frequency and suppresses components below it. This removes unnecessary low-frequency interference in the audio data, such as conversation, footsteps, and mechanical noise, and flattens the spectrum of the audio signal. The mathematical expression of the high-pass filter is:

H(z) = 1 - a z^{-1}

where a is the correction coefficient, typically between 0.95 and 0.97, and z is the z-transform variable of the audio signal.
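In the time domain, the transfer function H(z) = 1 - a z^{-1} corresponds to the difference equation y[n] = x[n] - a·x[n-1]. A minimal sketch follows; the function name `pre_emphasis` and the default a = 0.97 (chosen from the stated 0.95 to 0.97 range) are assumptions for illustration.

```python
from typing import List, Sequence


def pre_emphasis(samples: Sequence[float], a: float = 0.97) -> List[float]:
    """Apply the first-order high-pass filter y[n] = x[n] - a * x[n-1].

    The first sample is passed through unchanged because it has no
    predecessor. The default a = 0.97 is an assumed choice within the
    0.95 to 0.97 range given in the text.
    """
    if not samples:
        return []
    out = [float(samples[0])]
    for n in range(1, len(samples)):
        out.append(samples[n] - a * samples[n - 1])
    return out
```

A constant (zero-frequency) input is reduced to 1 - a of its amplitude after the first sample, which shows the low-frequency suppression the text describes.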

The denoised audio data is then divided into frames by the framing-and-windowing module 402 to obtain the audio data corresponding to different audio frames.

Illustratively, in this embodiment, 1024 data points of audio data form one frame. When the sampling frequency of the audio data is 16000 Hz, one frame lasts 64 ms. To avoid excessive change between two consecutive frames, and to avoid losing the data at both ends of a frame after windowing, the audio data is not divided into back-to-back frames. Instead, after each frame is taken, the window slides forward by 32 ms before the next frame is taken, so adjacent frames overlap by 32 ms.
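The framing arithmetic can be checked directly: 1024 samples at 16000 Hz last 1024 / 16000 s = 64 ms, and a 32 ms slide corresponds to 512 samples. The sketch below splits a sample sequence into overlapping frames; the function names are assumptions, and dropping a trailing partial frame is a simplification chosen here, not something the text specifies.

```python
from typing import List, Sequence


def frame_duration_ms(frame_len: int, sample_rate: int) -> float:
    """Duration of one frame in milliseconds."""
    return 1000 * frame_len / sample_rate


def split_into_frames(
    samples: Sequence[float], frame_len: int = 1024, hop_len: int = 512
) -> List[List[float]]:
    """Split samples into frames of `frame_len` that start every `hop_len` samples.

    With the defaults, adjacent frames share 512 samples (32 ms at 16 kHz).
    A trailing partial frame is dropped (an assumed simplification).
    """
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(list(samples[start:start + frame_len]))
        start += hop_len
    return frames
```

With 2048 input samples, this yields three frames starting at samples 0, 512, and 1024.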

The framed audio data must undergo a discrete Fourier transform during subsequent feature extraction, but a single frame of audio data has no obvious periodicity: the left and right ends of the frame are discontinuous, so the Fourier transform introduces error relative to the original data, and the more frames there are, the larger the error. Therefore, to make the framed audio data continuous and give each frame the characteristics of a periodic function, the framing and windowing module 402 also applies a window to each frame.

In a possible implementation, a Hamming window is applied to each audio frame. Multiplying each frame of data by the Hamming window function gives the resulting audio data an obvious periodicity. The Hamming window has the following functional form:

[Formula image BDA0002210741650000071: Hamming window function; the image is not reproduced in this text.]

where n is an integer ranging from 0 to M, and M is the number of Fourier transform points. Illustratively, this embodiment uses 1024 data points as the Fourier transform points.
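For illustration, the windowing step can be sketched with the commonly used Hamming form w(n) = 0.54 - 0.46·cos(2πn/(N-1)). The formula in this document is given only as an image, so the coefficients 0.54 and 0.46, the N-1 denominator, and the function names below are assumptions, not a transcription of the patent's formula.

```python
import math


def hamming_window(length):
    """Return `length` Hamming coefficients (assumed standard form).

    w(n) = 0.54 - 0.46 * cos(2*pi*n / (length - 1)), n = 0 .. length - 1
    """
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def apply_window(frame, window):
    """Multiply a frame element-wise by the window coefficients."""
    return [s * w for s, w in zip(frame, window)]


window = hamming_window(1024)               # matches a 1024-sample frame
windowed = apply_window([1.0] * 1024, window)
```

The window tapers both ends of the frame toward 0.08 while leaving the centre close to 1, which reduces the edge discontinuity that the Fourier transform would otherwise see.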

Step 303: perform feature extraction on the audio data corresponding to each audio frame to obtain the corresponding audio feature matrix.

After framing and windowing, features must be extracted from the audio data to obtain a feature matrix that the sound recognition model can recognize.

In a possible implementation, the Mel-Frequency Cepstral Coefficients (MFCC) of each audio frame are extracted; the process is shown in FIG. 4. Because the characteristics of an audio signal are hard to obtain from its time-domain waveform, the time-domain signal is usually converted into an energy distribution in the frequency domain. The terminal therefore first inputs the audio frame data into the Fourier transform module 403 for Fourier transformation, and then inputs the transformed data into the energy spectrum calculation module 404 to calculate the energy spectrum of the audio frame. To convert the energy spectrum into a Mel spectrum that matches human hearing, the energy spectrum is input into the Mel filtering module for filtering. The mathematical expression of the filtering is:

[Formula image BDA0002210741650000081: Mel filtering expression; the image is not reproduced in this text.]

where f is the frequency point after the Fourier transform.
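As a hedged illustration of how a frequency f maps onto the Mel scale, the sketch below uses the widely used conversion Mel(f) = 2595·log10(1 + f/700). The filtering expression in this document is given only as an image, so this formula and the helper names `hz_to_mel` and `mel_to_hz` are assumptions rather than the patent's exact expression.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to the Mel scale (assumed standard formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


mel_1k = hz_to_mel(1000.0)         # close to 1000 Mel by construction of the scale
round_trip = mel_to_hz(mel_1k)     # recovers the original frequency
nyquist_mel = hz_to_mel(8000.0)    # upper band edge for 16000 Hz sampling
```

Mel filter banks are typically built by spacing band edges evenly between 0 and `nyquist_mel` and mapping them back to Hz, so low frequencies get finer resolution than high ones.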

After obtaining the Mel spectrum of the audio frame, the terminal takes its logarithm and applies the discrete cosine transform (DCT) through the DCT module 406; the resulting DCT coefficients are the MFCC features.

Illustratively, this embodiment uses 40-dimensional MFCC features. When the terminal extracts features, the input window of audio data is 1056 ms long, one frame lasts 64 ms, and adjacent frames overlap by 32 ms, so each 1056 ms input window yields a 32*40 feature matrix.
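The 32*40 shape follows from the window, frame, and hop lengths: (1056 - 64) / 32 + 1 = 32 frames, each contributing 40 MFCC values. A minimal check of that arithmetic, with a hypothetical helper name, is:

```python
def frames_per_window(window_ms, frame_ms, hop_ms):
    """Number of full frames that fit in a window when frames start every hop_ms."""
    return (window_ms - frame_ms) // hop_ms + 1


n_frames = frames_per_window(window_ms=1056, frame_ms=64, hop_ms=32)
feature_shape = (n_frames, 40)  # rows: frames, columns: MFCC dimensions
```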

Step 304: input the audio feature matrix into the sound recognition model to obtain the target alarm tone recognition result output by the model, where the recognition result indicates whether the audio frame contains the target alarm tone.

Optionally, the terminal inputs the audio feature matrix obtained from feature extraction into the sound recognition model; the model determines whether the current audio frame contains the target alarm tone and outputs the recognition result.

In a possible implementation, if the terminal cannot obtain the alarm tones of vehicles in the current city on its own, the user must collect the target alarm tone in advance. Once collected, the audio data containing the target alarm tone goes through the framing and feature extraction of steps 302 to 303 above, and the audio feature matrices of the different target alarm tones are saved locally.

Step 305: when the number of audio frames containing the target alarm tone within a predetermined duration reaches a count threshold, determine that the ambient sound contains the target alarm tone.

Because the terminal divides the audio data into frames before recognizing the target alarm tone, and a single frame is very short, a single frame that appears to contain the target alarm tone may instead reflect a similar sound or an error in data processing during feature extraction, so the terminal cannot immediately conclude that the ambient sound contains the target alarm tone. The terminal therefore sets a predetermined duration: when the output of the sound recognition model indicates that the number of audio frames containing the target alarm tone within the predetermined duration reaches the count threshold, the terminal determines that the ambient sound contains the target alarm tone.

Illustratively, the terminal sets the predetermined duration to 5 seconds and the count threshold to 2. When the terminal recognizes within 5 seconds that 2 or more audio frames contain the target alarm tone, it determines that the current ambient sound contains the target alarm tone.
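One way to picture this post-processing is a sliding time window that counts positive frames. The sketch below uses the illustrative values of 5 seconds and a threshold of 2; the class name `FrameVoter`, its method names, and the use of timestamps in seconds are assumptions, not details from the patent.

```python
from collections import deque


class FrameVoter:
    """Confirm the alarm only when enough positive frames fall inside the window."""

    def __init__(self, window_s=5.0, count_threshold=2):
        self.window_s = window_s
        self.count_threshold = count_threshold
        self._hits = deque()  # timestamps of frames recognized as alarm

    def update(self, timestamp_s, frame_is_alarm):
        """Record one frame result and return True if the alarm is confirmed."""
        if frame_is_alarm:
            self._hits.append(timestamp_s)
        # Discard positive frames that are older than the predetermined duration.
        while self._hits and timestamp_s - self._hits[0] > self.window_s:
            self._hits.popleft()
        return len(self._hits) >= self.count_threshold


voter = FrameVoter()
results = [voter.update(t, hit)
           for t, hit in [(0.0, True), (1.0, False), (3.0, True)]]

late = FrameVoter()
late.update(0.0, True)
late_result = late.update(6.0, True)  # the first hit is more than 5 s old
```

A single positive frame never confirms the alarm on its own, which is the protection against isolated misrecognitions that the step describes.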

Step 306: obtain the previous alarm recognition time, which is the last time the ambient sound was recognized as containing the target alarm tone.

When the output of the sound recognition model indicates that the number of audio frames containing the target alarm tone within the predetermined duration has reached the count threshold, the terminal records the current time and obtains the last time the ambient sound was recognized as containing the target alarm tone, that is, the previous alarm recognition time.

Step 307: if the time interval between the previous alarm recognition time and the current alarm recognition time is greater than a time interval threshold, increment the number of stations traveled by one.

During an actual ride, a vehicle's door-closing alarm and door-opening alarm may be identical, which would cause the terminal to recognize the alarm twice at the same station. Alternatively, other vehicles of the same type may use the same alarm tone as the vehicle the terminal is in, so that a nearby vehicle sounds the same alarm while the terminal's vehicle is stopped at a station. Either case would cause a counting error. The terminal therefore presets a time interval threshold: if the interval between the previous alarm recognition time and the current alarm recognition time exceeds the threshold, the number of stations traveled is incremented by one.

Illustratively, the time interval threshold is preset to 1 minute. Each time the terminal recognizes that the ambient sound contains the target alarm tone, it records the current time and obtains the previous alarm recognition time. If the interval between the two is greater than one minute, the terminal determines that the vehicle has traveled one station and increments the number of stations traveled by one.
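The interval check of steps 306 and 307 can be sketched as a small counter. The class name `StationCounter`, the treatment of the very first alarm (counted, since there is no previous recognition time), and the choice to refresh the stored time on every recognition are assumptions made for illustration; the description does not spell these details out.

```python
class StationCounter:
    """Count stations from confirmed alarm recognitions, ignoring close repeats."""

    def __init__(self, min_interval_s=60.0):
        self.min_interval_s = min_interval_s
        self.last_alarm_s = None  # previous alarm recognition time
        self.stations = 0         # number of stations traveled

    def on_alarm(self, now_s):
        """Handle one confirmed alarm recognition and return the station count."""
        # Assumption: the first alarm has no predecessor and is always counted.
        if self.last_alarm_s is None or now_s - self.last_alarm_s > self.min_interval_s:
            self.stations += 1
        # Assumption: the stored time is refreshed on every recognition.
        self.last_alarm_s = now_s
        return self.stations


counter = StationCounter()
counts = [counter.on_alarm(t) for t in (0.0, 10.0, 100.0, 130.0)]
reached_target = counter.stations >= 2  # hypothetical target of 2 stations
```

Here the alarms at 10 s and 130 s fall within one minute of the preceding recognition and are not counted, so four recognitions produce two stations.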

Step 308: when the number of stations traveled reaches the target number of stations, give an arrival reminder, where the target number of stations is the number of stations between the starting station and the target station, and the target station is a transfer station or the destination station.

For the implementation of step 308, refer to step 204 above; details are not repeated in this embodiment.

In this embodiment of the present application, the audio data of the ambient sound is framed and windowed, and features are extracted from the audio frames to obtain data that the sound recognition model can recognize. The output of the sound recognition model is then post-processed to confirm whether a recognized alarm tone is the target alarm tone, which avoids misrecognizing the alarm tones of other vehicles or similar sounds as the target alarm tone and improves the accuracy of the arrival reminder.

Because the terminal must keep the microphone on throughout the ride to capture ambient sound and feed the audio data into the sound recognition model, the sound recognition model is always running. To reduce the terminal's power consumption, in a possible implementation the terminal uses a simple similarity model as the first-level sound recognition model; to improve recognition accuracy, it uses a convolutional neural network (CNN) model as the second-level sound recognition model. The recognition process is shown in FIG. 5. The terminal executes step 501 to input the ambient sound and, before recognition, extracts audio features in step 502. The extracted audio feature matrix is input into the first-level sound recognition model, which judges similarity in step 503. When the similarity result cannot determine whether the ambient sound contains the target alarm tone, the audio feature matrix is input into the second-level sound recognition model for CNN classification in step 504. If the CNN model recognizes that the ambient sound contains the target alarm tone, step 505 (post-processing) and step 506 (deciding whether to increment the count) are executed; if it recognizes that the ambient sound does not contain the target alarm tone, the terminal continues with the next frame of audio data.

In a possible implementation, on the basis of FIG. 3 and as shown in FIG. 6, step 304 above includes steps 304a to 304d.

Step 304a: input the audio feature matrix into the first-level sound recognition model to obtain the first recognition result output by the first-level sound recognition model.

In a possible implementation, the first-level sound recognition model is a cosine similarity model; its construction and recognition processes are shown in FIG. 7. The upper branch is the construction process. In step 701, audio features are extracted from the collected audio data of the target alarm tone (see steps 302 to 303 for details). In step 702, the extracted audio features are averaged, and the resulting 32*40 audio feature matrix of each target alarm tone is converted into a one-dimensional feature vector of 1280 elements and added to the database. Step 703 then generates the feature database. Once the model is built, recognition follows the lower branch of FIG. 7. After the ambient sound audio data is input, audio features are extracted in step 704, and a feature vector is generated from them in step 705 (the audio feature matrix of the audio frame is converted into a one-dimensional feature vector of 1280 elements). Finally, step 706 traverses the one-dimensional feature vectors Y_i in the feature database and computes the cosine similarity between each and the one-dimensional feature vector X of the current audio frame, using the formula:

[Formula image BDA0002210741650000111: cosine similarity between X and Y_i; the image is not reproduced in this text.]

where n is the number of types of target alarm tone, that is, the total number of one-dimensional feature vectors in the feature database.
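A minimal sketch of the first-level comparison, assuming the usual definition of cosine similarity (dot product divided by the product of the vector norms). The helper names `flatten`, `cosine_similarity`, and `best_match`, the zero-vector handling, and the choice to report the highest similarity over the database are assumptions for illustration.

```python
import math


def flatten(matrix):
    """Convert a 32*40 feature matrix into a 1280-element vector (row by row)."""
    return [value for row in matrix for value in row]


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # assumption: an all-zero vector matches nothing
    return dot / (norm_x * norm_y)


def best_match(frame_matrix, database):
    """Traverse the stored vectors and return the highest similarity found."""
    x = flatten(frame_matrix)
    return max(cosine_similarity(x, y) for y in database)


frame_matrix = [[1.0] * 40 for _ in range(32)]
database = [
    [1.0] * 1280,                                           # same direction
    [1.0 if i % 2 == 0 else -1.0 for i in range(1280)],     # orthogonal
]
score = best_match(frame_matrix, database)
```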

In other possible implementations, the first-level sound recognition model may use other similarity models, such as a Manhattan distance model, a Mahalanobis distance model, or a Euclidean distance model. This embodiment uses the cosine similarity model only as an illustrative example.

After the terminal obtains the first recognition result of the first-level sound recognition model for the audio feature matrix, it cannot directly determine whether the current ambient sound contains the target alarm tone; it must first judge whether the first recognition result meets the output condition.

In a possible implementation, step 304a includes the following sub-steps 1 and 2.

1. If the first recognition result indicates that the similarity is greater than a first similarity threshold or less than a second similarity threshold, determine that the first recognition result meets the output condition, where the first similarity threshold is greater than the second similarity threshold.

In a possible implementation, when the first recognition result indicates that the similarity is greater than or equal to the first similarity threshold, the terminal determines that the current ambient sound contains the target alarm tone; when the first recognition result indicates that the similarity is less than the second similarity threshold, the terminal determines that the current ambient sound does not contain the target alarm tone. The first similarity threshold is greater than the second similarity threshold. Both of these results meet the output condition.

Illustratively, the first similarity threshold is set to 0.9 and the second similarity threshold to 0.5. When the first recognition result indicates that the similarity is greater than or equal to 0.9, or less than 0.5, the first recognition result is determined to meet the output condition.

2. If the first recognition result indicates that the similarity is greater than the second similarity threshold and less than the first similarity threshold, determine that the first recognition result does not meet the output condition.

When the first recognition result indicates that the similarity is greater than the second similarity threshold and less than the first similarity threshold, the first-level sound recognition model cannot determine whether the ambient sound contains the target alarm tone, and the first recognition result does not meet the output condition. In this case, the second-level sound recognition model must further recognize the audio feature matrix of the current audio frame.

Illustratively, the first similarity threshold is set to 0.9 and the second similarity threshold to 0.5. When the first recognition result is less than 0.9 and greater than 0.5, the first recognition result is determined not to meet the output condition.
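The two sub-steps above amount to a three-way decision on the similarity score. The sketch below uses the illustrative thresholds 0.9 and 0.5; the function name, the string labels, and the handling of a score exactly equal to the second threshold (treated here as undetermined) are assumptions, since the description does not specify that boundary case.

```python
def first_stage_decision(similarity, high=0.9, low=0.5):
    """Classify a first-level similarity score.

    Returns "alarm" or "no_alarm" when the output condition is met,
    and "undetermined" when the second-level model is needed.
    """
    if similarity >= high:
        return "alarm"
    if similarity < low:
        return "no_alarm"
    return "undetermined"  # includes similarity == low (assumption)


decisions = [first_stage_decision(s) for s in (0.95, 0.9, 0.7, 0.5, 0.3)]
```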

Step 304b: if the first recognition result meets the output condition, output the first recognition result as the target alarm tone recognition result.

When the first recognition result meets the output condition, the first-level sound recognition model is able to determine whether the current ambient sound contains the target alarm tone, so the terminal outputs the first recognition result as the target alarm tone recognition result.

Illustratively, when the cosine similarity of the first recognition result is 0.95, the terminal determines that the current ambient sound contains the target alarm tone and outputs the first recognition result of 0.95.

Step 304c: if the first recognition result does not meet the output condition, input the audio feature matrix into the second-level sound recognition model to obtain the second recognition result output by the second-level sound recognition model.

When the first recognition result does not meet the output condition, the first-level sound recognition model cannot determine whether the current ambient sound contains the target alarm tone, and the second-level sound recognition model is needed for further recognition. The audio feature matrix of the audio frame whose first recognition result did not meet the output condition is input into the second-level sound recognition model to obtain the second recognition result.

In a possible implementation, the second-level sound recognition model is a CNN classification model, trained as follows:

1. Convert the collected ambient sound containing the target alarm tone into a spectrogram.

As shown in FIG. 8, the difference between the target alarm tone and other ambient sounds is clearly visible. The short lines inside the black box in the figure are the spectrum of the target alarm tone. The target alarm tone is marked as the positive sample, and the remaining ambient sounds are negative samples.

2. Perform feature extraction on the collected ambient sound.

Features are extracted from the pre-collected ambient sound in the same way as in the embodiment above. The audio feature matrix of each audio frame serves as one training sample. The label of an audio feature matrix of the target alarm tone is 0, and the label of an audio feature matrix of any other ambient sound is 1.

3. Build the CNN model.

In a possible implementation, the CNN model structure is shown in FIG. 9. The first convolutional layer 901 and the second convolutional layer 902 extract features from the input audio feature matrix. The first fully connected layer 903 and the second fully connected layer 904 integrate the class-discriminative information from convolutional layers 901 and 902. Finally, a normalized exponential function (Softmax) 905 classifies the information integrated by the fully connected layers to obtain the second recognition result.

4. Build the loss function of the model.

Because the target alarm tone usually lasts only about 5 seconds while the vehicle is traveling, whereas the other ambient sounds last several minutes, the positive and negative sample data are highly imbalanced. Focal loss is therefore chosen to address the sample imbalance. The focal loss formula is:

[Formula image BDA0002210741650000131: focal loss; the image is not reproduced in this text.]

where y′ is the probability output by the CNN classification model, y is the label of the training sample, and α and γ are manually tuned parameters used to adjust the ratio of positive and negative samples.
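For orientation, the commonly published binary form of focal loss is -α(1-y′)^γ·log(y′) for y = 1 and -(1-α)·y′^γ·log(1-y′) for y = 0. The formula in this document is given only as an image, so the sketch below assumes that standard form; the defaults α = 0.25 and γ = 2, the clipping constant, and the convention that y = 1 marks the class whose probability y′ is predicted are assumptions and may differ from the patent's label assignment.

```python
import math


def focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for one sample (assumed standard form).

    y_true: label, 0 or 1.  y_pred: predicted probability of class 1.
    """
    p = min(max(y_pred, eps), 1.0 - eps)  # avoid log(0)
    if y_true == 1:
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)


easy = focal_loss(1, 0.9)   # confident, correct prediction
hard = focal_loss(1, 0.1)   # confident, wrong prediction
plain = focal_loss(1, 0.5, alpha=0.5, gamma=0.0)  # reduces to scaled cross-entropy
```

With γ greater than 0, well-classified samples contribute far less to the total loss than misclassified ones, which keeps the abundant background frames from dominating training.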

5. Import the training samples and train the model.

In a possible implementation, the CNN classification model can be trained with TensorFlow, using focal loss and a gradient descent algorithm until the model converges.

In a possible implementation, the second-level sound recognition model may also be another traditional machine learning classifier or deep learning classification model, which is not limited in this embodiment.

Step 304d: output the second recognition result as the target alarm tone recognition result.

The second-level sound recognition model is a CNN classification model with higher accuracy. Therefore, once the second recognition result of the CNN classification model is obtained, it is output as the target alarm tone recognition result.

This embodiment of the present application uses two sound recognition models. The first-level sound recognition model has low power consumption and is easy to implement; it runs in real time and recognizes the audio data of the ambient sound. The second-level sound recognition model has high accuracy but high power consumption; it is invoked only when the first-level model cannot determine whether the current ambient sound contains the target alarm tone. This improves the accuracy of the recognition result while reducing the terminal's power consumption.
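The control flow of steps 304a to 304d can be sketched as a cascade in which the costly model is called only for ambiguous frames. In the sketch below, `first_stage` and `second_stage` are hypothetical callables standing in for the similarity model and the CNN classifier; their signatures, the thresholds, and the returned tuple are assumptions for illustration.

```python
def recognize_frame(features, first_stage, second_stage, high=0.9, low=0.5):
    """Return (contains_alarm, stage_used) for one audio feature matrix.

    first_stage(features)  -> similarity score (cheap, always run)
    second_stage(features) -> bool (expensive, run only when undetermined)
    """
    similarity = first_stage(features)
    if similarity >= high:
        return True, "first"
    if similarity < low:
        return False, "first"
    return bool(second_stage(features)), "second"


calls = []


def fake_cnn(features):
    """Stand-in for the second-level model that records each invocation."""
    calls.append(features)
    return True


outcomes = [
    recognize_frame("frame-a", lambda f: 0.95, fake_cnn),
    recognize_frame("frame-b", lambda f: 0.20, fake_cnn),
    recognize_frame("frame-c", lambda f: 0.70, fake_cnn),
]
```

In this run the stand-in CNN is invoked once, for the single ambiguous frame, which mirrors the power-saving rationale given above.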

Please refer to FIG. 10, which shows a structural block diagram of an arrival reminder apparatus provided by an exemplary embodiment of the present application. The apparatus may be implemented as all or part of a terminal through software, hardware, or a combination of the two. The apparatus includes:

a collection module 1001, configured to collect ambient sound through a microphone when in a vehicle;

a recognition module 1002, configured to recognize the ambient sound;

a counting module 1003, configured to increment the number of stations traveled by one when the ambient sound is recognized as containing a target alarm tone, the target alarm tone being a door-opening alarm tone or a door-closing alarm tone;

a reminder module 1004, configured to give an arrival reminder when the number of stations traveled reaches a target number of stations, the target number of stations being the number of stations between the starting station and a target station, and the target station being a transfer station or a destination station.

Optionally, the recognition module 1002 includes:

a framing unit, configured to divide the audio data corresponding to the ambient sound into frames to obtain the audio data corresponding to different audio frames;

a feature extraction unit, configured to perform feature extraction on the audio data corresponding to each audio frame to obtain the corresponding audio feature matrix;

a sound recognition unit, configured to input the audio feature matrix into a sound recognition model to obtain a target alarm tone recognition result output by the sound recognition model, the target alarm tone recognition result indicating whether the audio frame contains the target alarm tone.

Optionally, the sound recognition model includes a first-level sound recognition model and a second-level sound recognition model, and the recognition accuracy of the second-level sound recognition model is higher than that of the first-level sound recognition model;

the sound recognition unit is further configured to:

input the audio feature matrix into the first-level sound recognition model to obtain a first recognition result output by the first-level sound recognition model;

if the first recognition result meets an output condition, output the first recognition result as the target alarm tone recognition result;

if the first recognition result does not meet the output condition, input the audio feature matrix into the second-level sound recognition model to obtain a second recognition result output by the second-level sound recognition model;

output the second recognition result as the target alarm tone recognition result.

Optionally, the first-level sound recognition model is used to calculate the similarity between the audio feature matrix and a sample feature matrix in a feature database, the sample feature matrix being the audio feature matrix of the target alarm tone;

the sound recognition unit is further configured to:

if the first recognition result indicates that the similarity is greater than a first similarity threshold or less than a second similarity threshold, determine that the first recognition result meets the output condition, the first similarity threshold being greater than the second similarity threshold;

if the first recognition result indicates that the similarity is greater than the second similarity threshold and less than the first similarity threshold, determine that the first recognition result does not meet the output condition.

Optionally, the recognition module 1002 further includes:

a determining unit, configured to determine that the ambient sound contains the target alarm tone when the number of audio frames containing the target alarm tone within a predetermined duration reaches a count threshold.

Optionally, the counting module 1003 includes:

an obtaining unit, configured to obtain a previous alarm recognition time, the previous alarm recognition time being the last time the ambient sound was recognized as containing the target alarm tone;

a comparison unit, configured to increment the number of stations traveled by one if the time interval between the previous alarm recognition time and the current alarm recognition time is greater than a time interval threshold.

In this embodiment, the arrival reminder apparatus collects ambient sound in real time and recognizes whether the current ambient sound contains the target alarm tone, so that the number of stations traveled is updated when the target alarm tone is recognized and an arrival reminder is given when the number of stations traveled reaches the target number of stations. Because alarm tones are meant to warn passengers, their acoustic features are distinctive and easy to recognize, so basing the arrival reminder on the alarm tone in the ambient sound improves its accuracy and effectiveness.

Please refer to FIG. 11, which shows a structural block diagram of a terminal 1100 provided by an exemplary embodiment of the present application. The terminal 1100 may be an electronic device on which application programs are installed and run, such as a smartphone, a tablet computer, an e-book reader, or a portable personal computer. The terminal 1100 in this application may include one or more of the following components: a processor 1110, a memory 1120, and a screen 1130.

The processor 1110 may include one or more processing cores. The processor 1110 connects the various parts of the terminal 1100 through various interfaces and lines, and performs the functions of the terminal 1100 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 1120 and by calling data stored in the memory 1120. Optionally, the processor 1110 may be implemented in at least one of the following hardware forms: digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA). The processor 1110 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, applications, and so on; the GPU renders and draws the content to be displayed on the screen 1130; and the modem handles wireless communication. It can be understood that the modem may also be implemented as a separate communication chip rather than being integrated into the processor 1110.

The memory 1120 may include random access memory (RAM) or read-only memory (ROM). Optionally, the memory 1120 includes a non-transitory computer-readable storage medium. The memory 1120 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 1120 may include a program storage area and a data storage area. The program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playback function, or an image playback function), instructions for implementing the method embodiments above, and so on. The operating system may be Android (including systems developed in depth on the basis of Android), iOS developed by Apple (including systems developed in depth on the basis of iOS), or another system. The data storage area may also store data created by the terminal 1100 during use (such as a phone book, audio and video data, and chat records).

The screen 1130 may be a capacitive touch display, which receives touch operations performed on or near it by a user with a finger, a stylus, or any other suitable object, and displays the user interface of each application. The touch display is usually arranged on the front panel of the terminal 1100. It may be designed as a full screen, a curved screen, or a special-shaped screen, or as a combination of a full screen and a curved screen or of a special-shaped screen and a curved screen, which is not limited in the embodiments of this application.

In addition, those skilled in the art will understand that the structure of the terminal 1100 shown in the drawing above does not limit the terminal 1100. The terminal may include more or fewer components than shown, combine certain components, or use a different arrangement of components. For example, the terminal 1100 may further include a radio frequency circuit, a camera component, sensors, an audio circuit, a wireless fidelity (WiFi) component, a power supply, a Bluetooth component, and other components, which are not described here.

An embodiment of this application further provides a computer-readable medium that stores at least one instruction, the at least one instruction being loaded and executed by a processor to implement the arrival reminder method described in the embodiments above.

An embodiment of this application further provides a computer program product that stores at least one instruction, the at least one instruction being loaded and executed by a processor to implement the arrival reminder method described in the embodiments above.

Those skilled in the art will appreciate that, in one or more of the examples above, the functions described in the embodiments of this application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates the transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.

The above are only preferred embodiments of this application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within its scope of protection.

Claims (16)

1. An arrival reminder method, characterized in that the method comprises:
when in a vehicle, collecting ambient sound through a microphone;
recognizing the ambient sound;
when it is recognized that the ambient sound contains a target alarm bell, incrementing the number of stations traveled by one, the target alarm bell being a door-opening alarm bell or a door-closing alarm bell; and
when the number of stations traveled reaches a target number of stations, issuing an arrival reminder, the target number of stations being the number of stations between a starting station and a target station, and the target station being a transfer station or a destination station.

2. The method according to claim 1, wherein recognizing the ambient sound comprises:
performing framing on the audio data corresponding to the ambient sound to obtain audio data corresponding to different audio frames;
performing feature extraction on the audio data corresponding to each audio frame to obtain a corresponding audio feature matrix; and
inputting the audio feature matrix into a sound recognition model to obtain a target alarm bell recognition result output by the sound recognition model, the target alarm bell recognition result indicating whether the audio frame contains the target alarm bell.

3. The method according to claim 2, wherein the sound recognition model comprises a first-level sound recognition model and a second-level sound recognition model, the recognition accuracy of the second-level sound recognition model being higher than that of the first-level sound recognition model; and
inputting the audio feature matrix into the sound recognition model to obtain the target alarm bell recognition result output by the sound recognition model comprises:
inputting the audio feature matrix into the first-level sound recognition model to obtain a first recognition result output by the first-level sound recognition model;
if the first recognition result meets an output condition, outputting the first recognition result as the target alarm bell recognition result;
if the first recognition result does not meet the output condition, inputting the audio feature matrix into the second-level sound recognition model to obtain a second recognition result output by the second-level sound recognition model; and
outputting the second recognition result as the target alarm bell recognition result.

4. The method according to claim 3, wherein the first-level sound recognition model is used to calculate the similarity between the audio feature matrix and a sample feature matrix in a feature database, the sample feature matrix being an audio feature matrix of the target alarm bell; and
after inputting the audio feature matrix into the first-level sound recognition model to obtain the first recognition result output by the first-level sound recognition model, the method further comprises:
if the first recognition result indicates that the similarity is greater than a first similarity threshold or less than a second similarity threshold, determining that the first recognition result meets the output condition, the first similarity threshold being greater than the second similarity threshold; and
if the first recognition result indicates that the similarity is greater than the second similarity threshold and less than the first similarity threshold, determining that the first recognition result does not meet the output condition.

5. The method according to claim 3, wherein the second-level sound recognition model is a binary classification model using a convolutional neural network (CNN), the binary classification model being trained on positive and negative samples with focal loss as the loss function, using a gradient descent algorithm.

6. The method according to any one of claims 2 to 5, wherein after inputting the audio feature matrix into the sound recognition model to obtain the target alarm bell recognition result output by the sound recognition model, the method further comprises:
when the number of audio frames containing the target alarm bell within a predetermined duration reaches a count threshold, determining that the ambient sound contains the target alarm bell.

7. The method according to any one of claims 1 to 6, wherein incrementing the number of stations traveled by one comprises:
acquiring a previous alarm bell recognition time, the previous alarm bell recognition time being the time at which the target alarm bell was last recognized in the ambient sound; and
if the time interval between the previous alarm bell recognition time and the current alarm bell recognition time is greater than a time interval threshold, incrementing the number of stations traveled by one.

8. An arrival reminder device, characterized in that the device comprises:
a collection module, configured to collect ambient sound through a microphone when in a vehicle;
a recognition module, configured to recognize the ambient sound;
a counting module, configured to increment the number of stations traveled by one when it is recognized that the ambient sound contains a target alarm bell, the target alarm bell being a door-opening alarm bell or a door-closing alarm bell; and
a reminder module, configured to issue an arrival reminder when the number of stations traveled reaches a target number of stations, the target number of stations being the number of stations between a starting station and a target station, and the target station being a transfer station or a destination station.

9. The device according to claim 8, wherein the recognition module comprises:
a framing unit, configured to perform framing on the audio data corresponding to the ambient sound to obtain audio data corresponding to different audio frames;
a feature extraction unit, configured to perform feature extraction on the audio data corresponding to each audio frame to obtain a corresponding audio feature matrix; and
a sound recognition unit, configured to input the audio feature matrix into a sound recognition model to obtain a target alarm bell recognition result output by the sound recognition model, the target alarm bell recognition result indicating whether the audio frame contains the target alarm bell.

10. The device according to claim 9, wherein the sound recognition model comprises a first-level sound recognition model and a second-level sound recognition model, the recognition accuracy of the second-level sound recognition model being higher than that of the first-level sound recognition model; and
the sound recognition unit is further configured to:
input the audio feature matrix into the first-level sound recognition model to obtain a first recognition result output by the first-level sound recognition model;
if the first recognition result meets an output condition, output the first recognition result as the target alarm bell recognition result;
if the first recognition result does not meet the output condition, input the audio feature matrix into the second-level sound recognition model to obtain a second recognition result output by the second-level sound recognition model; and
output the second recognition result as the target alarm bell recognition result.

11. The device according to claim 10, wherein the first-level sound recognition model is used to calculate the similarity between the audio feature matrix and a sample feature matrix in a feature database, the sample feature matrix being an audio feature matrix of the target alarm bell; and
the sound recognition unit is further configured to:
if the first recognition result indicates that the similarity is greater than a first similarity threshold or less than a second similarity threshold, determine that the first recognition result meets the output condition, the first similarity threshold being greater than the second similarity threshold; and
if the first recognition result indicates that the similarity is greater than the second similarity threshold and less than the first similarity threshold, determine that the first recognition result does not meet the output condition.

12. The device according to claim 10, wherein the second-level sound recognition model is a binary classification model using a CNN, the binary classification model being trained on positive and negative samples with focal loss as the loss function, using a gradient descent algorithm.

13. The device according to any one of claims 9 to 12, wherein the recognition module further comprises:
a determination unit, configured to determine that the ambient sound contains the target alarm bell when the number of audio frames containing the target alarm bell within a predetermined duration reaches a count threshold.

14. The device according to any one of claims 8 to 13, wherein the counting module comprises:
an acquisition unit, configured to acquire a previous alarm bell recognition time, the previous alarm bell recognition time being the time at which the target alarm bell was last recognized in the ambient sound; and
a comparison unit, configured to increment the number of stations traveled by one if the time interval between the previous alarm bell recognition time and the current alarm bell recognition time is greater than a time interval threshold.

15. A terminal, characterized in that the terminal comprises a processor and a memory, the memory storing at least one instruction to be executed by the processor to implement the arrival reminder method according to any one of claims 1 to 7.

16. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction to be executed by a processor to implement the arrival reminder method according to any one of claims 1 to 7.
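The two-level recognition of claims 3 and 4 and the frame-count decision of claim 6 can be sketched as follows. This is a hedged illustration only: cosine similarity over flattened matrices is an assumed similarity measure (the claims do not name one), the threshold values 0.9 and 0.3 are placeholders, `second_level` is a stand-in callable for the CNN classifier, and all function names are hypothetical.

```python
import math
from typing import Callable, Sequence

FeatureMatrix = Sequence[Sequence[float]]


def cosine_similarity(a: FeatureMatrix, b: FeatureMatrix) -> float:
    """Cosine similarity of two equally shaped matrices (assumed measure)."""
    flat_a = [x for row in a for x in row]
    flat_b = [x for row in b for x in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm = math.sqrt(sum(x * x for x in flat_a)) * math.sqrt(sum(y * y for y in flat_b))
    return dot / norm if norm else 0.0


def recognize_frame(
    features: FeatureMatrix,
    samples: Sequence[FeatureMatrix],  # non-empty database of target-bell features
    second_level: Callable[[FeatureMatrix], bool],  # stand-in for the CNN model
    high: float = 0.9,  # placeholder first similarity threshold
    low: float = 0.3,  # placeholder second similarity threshold (low < high)
) -> bool:
    """Return True if the frame is judged to contain the target alarm bell."""
    similarity = max(cosine_similarity(features, s) for s in samples)
    if similarity > high:
        return True  # first level is confident: bell present
    if similarity < low:
        return False  # first level is confident: bell absent
    # Uncertain band (boundary values included here by assumption):
    # defer to the more accurate second-level model.
    return second_level(features)


def bell_in_window(frame_results: Sequence[bool], count_threshold: int) -> bool:
    """Decide for one predetermined-duration window of per-frame results."""
    return sum(frame_results) >= count_threshold
```

The design intent suggested by the claims is that the inexpensive similarity check settles the clear cases, so the more accurate second-level model runs only on frames whose similarity falls between the two thresholds.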
CN201910897452.7A 2019-09-23 2019-09-23 Arrival reminder method, device, terminal and storage medium Active CN110660201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910897452.7A CN110660201B (en) 2019-09-23 2019-09-23 Arrival reminder method, device, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN110660201A (en) 2020-01-07
CN110660201B (en) 2021-07-09

Family

ID=69038806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910897452.7A Active CN110660201B (en) 2019-09-23 2019-09-23 Arrival reminder method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN110660201B (en)

Also Published As

Publication number Publication date
CN110660201B (en) 2021-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant