CN116030830A - Prompt broadcasting system for aircraft crews and method thereof - Google Patents


Info

Publication number: CN116030830A
Application number: CN202310308003.0A
Authority: CN (China)
Prior art keywords: aircraft, feature, matrix, aircraft vibration, vibration
Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to its accuracy)
Other languages: Chinese (zh)
Inventor: 贾超
Current Assignee: Binzhou University (listed assignees may be inaccurate)
Original Assignee: Binzhou University
Events: application filed by Binzhou University; priority to CN202310308003.0A; publication of CN116030830A; legal status withdrawn

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T — CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 90/00 — Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to the field of broadcasting, and specifically discloses a prompt broadcasting system for aircraft crews and a method thereof. A deep-learning-based neural network model fuses the vibration frequency-domain correlation features and the audio waveform semantic features of the broadcasting process, and the audio data to be played are further adaptively corrected based on the vibration condition of the aircraft so as to pre-compensate the audio signal before it is transmitted. In this way, the propagation offset of the audio signal caused by aircraft fluctuation is avoided, ensuring that the audio signal reaches the ears of the passengers with fidelity.

Description

Prompt broadcasting system for aircraft crews and method thereof
Technical Field
The invention relates to the broadcasting field, in particular to a prompt broadcasting system for an aircraft crew member and a method thereof.
Background
The flight attendant cue broadcasting system is an important auxiliary support system during aircraft navigation: through it, flight attendants transmit information to passengers. However, while the crew broadcasts, aircraft fluctuation shifts the propagation of the sound signal, so some passengers may miss emphasized content, and the broadcast may even become muddled, ambiguous, or misaligned.
Accordingly, an optimized aircraft attendant cue-play system is desired.
Disclosure of Invention
The present invention has been made to solve the above-mentioned technical problems. The embodiment of the invention provides a prompt broadcasting system for aircraft crews and a method thereof. A deep-learning-based neural network model fuses the vibration frequency-domain correlation features and the audio waveform semantic features of the broadcasting process, and the audio data to be played are further adaptively corrected based on the vibration condition of the aircraft so as to pre-compensate the audio signal before it is transmitted. In this way, the propagation offset of the audio signal caused by aircraft fluctuation is avoided, ensuring that the audio signal reaches the ears of the passengers with fidelity.
According to one aspect of the present invention, there is provided an aircraft attendant cue-casting system comprising: the system comprises a data acquisition module to be played, a data processing module and a data processing module, wherein the data acquisition module to be played is used for acquiring audio data to be played in a preset time period provided by an aircraft attendant and an aircraft vibration signal in the preset time period; the vibration characteristic extraction module is used for carrying out frequency domain characteristic analysis on the aircraft vibration signals so as to obtain a plurality of aircraft vibration frequency domain statistical characteristics; the multi-mode coding module is used for inputting the aircraft vibration signals and the plurality of aircraft vibration frequency domain statistical characteristics into a multi-mode joint encoder comprising an image encoder and a sequence encoder so as to obtain an aircraft vibration characteristic matrix; the audio waveform feature extraction module is used for enabling the audio data to be played in the preset time period to pass through a convolutional neural network model serving as a feature extractor so as to obtain an audio waveform image feature matrix; the feature fusion module is used for fusing the aircraft vibration feature matrix and the audio waveform image feature matrix to obtain a fused feature matrix; the countermeasure generation module is used for enabling the fusion feature matrix to pass through a generator based on a countermeasure generation network to obtain corrected audio data to be played; and the broadcasting module is used for broadcasting the corrected audio data to be played.
In the foregoing prompt broadcast system for an aircraft attendant, the image encoder is a convolutional neural network model serving as a filter, and the sequence encoder is a multi-scale neighborhood feature extraction module, where the multi-scale neighborhood feature extraction module includes: the device comprises a first convolution layer, a second convolution layer parallel to the first convolution layer and a multi-scale feature fusion layer connected with the first convolution layer and the second convolution layer, wherein the first convolution layer uses a one-dimensional convolution kernel with a first length, and the second convolution layer uses a one-dimensional convolution kernel with a second length.
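The multi-scale neighborhood feature extraction module above can be sketched as two parallel one-dimensional convolutions with different kernel lengths whose outputs are fused. This is a minimal illustrative sketch, not the patented implementation: the kernel values, the lengths 3 and 5, and fusion by concatenation are all assumptions.

```python
# Sketch of the sequence encoder: two parallel 1-D convolution layers with
# kernels of different lengths, fused by a simple concatenation layer.
# Kernel values and lengths are illustrative assumptions.

def conv1d(seq, kernel):
    """Valid 1-D convolution (cross-correlation) of a sequence with a kernel."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def multi_scale_features(seq):
    small = conv1d(seq, [0.25, 0.5, 0.25])            # first conv layer, length 3
    large = conv1d(seq, [0.1, 0.2, 0.4, 0.2, 0.1])    # second conv layer, length 5
    return small + large                              # multi-scale fusion (concat)

stats = [1.0, 2.0, 4.0, 2.0, 1.0, 0.5, 0.25]          # toy frequency-domain statistics
features = multi_scale_features(stats)
```

The two kernel lengths let the encoder capture neighborhood patterns at two scales of the discrete frequency-domain statistics before fusion.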
In the above-mentioned prompt broadcast system for aircraft crews, the multi-mode coding module includes: a frequency domain feature extraction unit for inputting the plurality of aircraft vibration frequency domain statistical features into the sequence encoder to obtain an aircraft vibration frequency domain statistical feature vector; a vibration waveform feature extraction unit for inputting the aircraft vibration signal into the image encoder to obtain an aircraft vibration waveform feature vector; and a joint optimization unit for calculating the vector product between the transpose of the aircraft vibration waveform feature vector and the aircraft vibration frequency domain statistical feature vector to obtain the aircraft vibration feature matrix.
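The joint optimization unit above amounts to an outer product of the two single-modality vectors, which yields a matrix pairing every waveform feature with every frequency-domain statistic. A hedged sketch, with illustrative vector contents and dimensions:

```python
# Sketch of the joint optimization unit: the vector product of the vibration
# waveform feature vector (transposed) with the frequency-domain statistical
# feature vector is an outer product, producing the aircraft vibration
# feature matrix. The values below are toy data.

def outer_product(u, v):
    """Matrix whose (i, j) entry is u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]

waveform_vec = [0.5, -1.0, 2.0]     # hypothetical output of the image encoder
freq_stat_vec = [1.0, 3.0]          # hypothetical output of the sequence encoder
vibration_matrix = outer_product(waveform_vec, freq_stat_vec)   # 3 x 2 matrix
```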
In the above-mentioned prompt broadcast system for aircraft crews, the audio waveform feature extraction module is configured such that each layer of the convolutional neural network model serving as the feature extractor processes its input data in forward pass as follows: convolving the input data to obtain a convolution feature map; pooling the convolution feature map along the channel dimension to obtain a pooled feature map; and applying a nonlinear activation to the pooled feature map to obtain an activated feature map. The output of the last layer of the convolutional neural network serving as the feature extractor is the audio waveform image feature matrix, and the input of its first layer is the audio data to be played in the predetermined time period.
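The per-layer pipeline above (convolution, pooling along the channel dimension, nonlinear activation) can be sketched in miniature. This is an illustrative sketch under stated assumptions, not the patent's architecture: one spatial dimension instead of two, mean-pooling over channels, and ReLU as the activation.

```python
# One feature-extractor layer as described: convolve, pool along the channel
# dimension, then activate. Kernels, shapes, and the mean-pooling choice are
# illustrative assumptions.

def conv1d(seq, kernel):
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def layer_forward(channels, kernels):
    # 1. convolution: one kernel per channel
    conv_maps = [conv1d(ch, kern) for ch, kern in zip(channels, kernels)]
    # 2. pooling along the channel dimension (mean over channels per position)
    n = len(conv_maps)
    pooled = [sum(col) / n for col in zip(*conv_maps)]
    # 3. nonlinear activation (ReLU)
    return [max(0.0, x) for x in pooled]

audio = [[0.0, 1.0, -1.0, 2.0, 0.5],    # toy two-channel audio snippet
         [1.0, 0.0, 1.0, -2.0, 1.0]]
out = layer_forward(audio, [[1.0, -1.0], [0.5, 0.5]])
```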
In the foregoing alert broadcast system for an aircraft attendant, the feature fusion module includes: a feature matrix unfolding unit for unfolding the aircraft vibration feature matrix and the audio waveform image feature matrix into an aircraft vibration feature vector and an audio waveform image feature vector; an affine mapping factor calculation unit for calculating correlation-probability density distribution affine mapping factors between the aircraft vibration feature vector and the audio waveform image feature vector to obtain a first correlation-probability density distribution affine mapping factor and a second correlation-probability density distribution affine mapping factor; and a fusion unit for calculating the position-wise weighted sum of the aircraft vibration feature matrix and the audio waveform image feature matrix, using the first and second correlation-probability density distribution affine mapping factors as weights, to obtain the fused feature matrix.
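The fusion unit's position-wise weighted sum can be sketched as follows. The factor values here are placeholders, not the output of the optimization formula described below.

```python
# Sketch of the fusion unit: a position-wise weighted sum of the two feature
# matrices, with the two affine mapping factors as scalar weights. Matrix
# contents and the weights 0.6 / 0.4 are toy values.

def weighted_fusion(mat_a, mat_b, w1, w2):
    return [[w1 * a + w2 * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mat_a, mat_b)]

vibration = [[1.0, 2.0], [3.0, 4.0]]    # aircraft vibration feature matrix
waveform = [[0.5, 0.5], [0.5, 0.5]]     # audio waveform image feature matrix
fused = weighted_fusion(vibration, waveform, 0.6, 0.4)
```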
In the above-described aircraft attendant cue broadcasting system, the affine mapping factor calculating unit is configured to: calculating a correlation-probability density distribution affine mapping factor of the aircraft vibration feature vector and the audio waveform image feature vector in the following optimization formula to obtain a first correlation-probability density distribution affine mapping factor and a second correlation-probability density distribution affine mapping factor; wherein, the optimization formula is:
\[ w_1 = \frac{1}{L}\sum_{i=1}^{L}\left[\exp\left(M \otimes \Sigma^{-1}\right) \otimes \left(V_a - \mu\right)\right]_i \]
\[ w_2 = \frac{1}{L}\sum_{i=1}^{L}\left[\exp\left(M^{\top} \otimes \Sigma^{-1}\right) \otimes \left(V_b - \mu\right)\right]_i \]
wherein $V_a$ represents the aircraft vibration feature vector, $V_b$ represents the audio waveform image feature vector, $L$ is the length of the two feature vectors, $M$ is the correlation matrix obtained by position-by-position correlation between the aircraft vibration feature vector and the audio waveform image feature vector, $\mu$ and $\Sigma$ are the mean vector and covariance matrix of the Gaussian density map formed by the aircraft vibration feature vector and the audio waveform image feature vector, $\otimes$ represents matrix multiplication, $\exp(\cdot)$ represents the exponential operation of a matrix, that is, calculating the natural exponential function value raised to the power of the feature value at each position in the matrix, $w_1$ represents the first correlation-probability density distribution affine mapping factor, and $w_2$ represents the second correlation-probability density distribution affine mapping factor.
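Because the original formula images are not reproduced in this text, the following sketch follows one plausible reading of the symbol definitions: $M$ as the outer-product correlation matrix of the two vectors, $\mu$ and a diagonal $\Sigma$ from their two-sample Gaussian density map, and each factor as the average of $\exp(M)\,\Sigma^{-1}\,(V-\mu)$. Every modelling choice here is an assumption, not the patent's definitive computation.

```python
# Hedged sketch of the affine mapping factor computation under one plausible
# reading of the (unreproduced) optimization formula. All quantities are toy
# values; the diagonal covariance and the averaging are assumptions.
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(mat, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]

def affine_mapping_factors(va, vb):
    n = len(va)
    m = [[x * y for y in vb] for x in va]               # correlation matrix M
    mu = [(x + y) / 2 for x, y in zip(va, vb)]          # mean of the density map
    # diagonal covariance of the two-sample Gaussian density map, inverted
    sigma_inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        var = ((va[i] - mu[i]) ** 2 + (vb[i] - mu[i]) ** 2) / 2 + 1e-6
        sigma_inv[i][i] = 1.0 / var
    exp_m = [[math.exp(v) for v in row] for row in m]                 # exp(M)
    exp_mt = [[math.exp(m[j][i]) for j in range(n)] for i in range(n)]  # exp(M^T)
    w1 = sum(matvec(matmul(exp_m, sigma_inv), [x - u for x, u in zip(va, mu)])) / n
    w2 = sum(matvec(matmul(exp_mt, sigma_inv), [y - u for y, u in zip(vb, mu)])) / n
    return w1, w2

w1, w2 = affine_mapping_factors([0.2, -0.1, 0.4], [0.1, 0.3, -0.2])
```

Note that when the two vectors coincide, their density map has zero spread around the mean, so both factors vanish.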
In the above-described aircraft attendant cue-broadcasting system, the countermeasure-generating network includes a generator and a discriminator.
According to another aspect of the present invention, there is provided an alert broadcasting method for an aircraft attendant, including: acquiring audio data to be played for a predetermined period of time provided by an aircraft attendant, and an aircraft vibration signal for the predetermined period of time; performing frequency domain feature analysis on the aircraft vibration signals to obtain a plurality of aircraft vibration frequency domain statistical features; inputting the aircraft vibration signals and the plurality of aircraft vibration frequency domain statistical features into a multi-mode joint encoder comprising an image encoder and a sequence encoder to obtain an aircraft vibration feature matrix; the audio data to be played in the preset time period passes through a convolutional neural network model serving as a feature extractor to obtain an audio waveform image feature matrix; fusing the aircraft vibration feature matrix and the audio waveform image feature matrix to obtain a fused feature matrix; the fusion feature matrix passes through a generator based on a countermeasure generation network to obtain corrected audio data to be played; and playing the corrected audio data to be played.
According to still another aspect of the present invention, there is provided an electronic apparatus including: a processor; and a memory having stored therein computer program instructions that, when executed by the processor, cause the processor to perform the aircraft attendant prompt broadcast method as described above.
According to a further aspect of the present invention there is provided a computer readable medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform an aircraft attendant cue broadcasting method as described above.
Compared with the prior art, the prompt broadcasting system for aircraft crews and the method thereof provided by the invention fuse the vibration frequency-domain correlation features and the audio waveform semantic features of the broadcasting process with a deep-learning-based neural network model, and further adaptively correct the audio data to be played based on the vibration condition of the aircraft so as to pre-compensate the audio signal before it is transmitted. In this way, the propagation offset of the audio signal caused by aircraft fluctuation is avoided, ensuring that the sound signal reaches the ears of the passengers with fidelity.
Drawings
The above and other objects, features, and advantages of the present invention will become more apparent from the following detailed description of embodiments of the present invention with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of embodiments of the invention and are incorporated in and constitute a part of this specification; they illustrate the invention and, together with the embodiments, serve to explain the invention without limiting it. In the drawings, like reference numerals generally refer to like parts or steps.
Fig. 1 is a schematic view of a scene of an aircraft attendant cue-play system according to an embodiment of the present invention.
Fig. 2 is a block diagram of an aircraft attendant cue-casting system according to an embodiment of the present invention.
Fig. 3 is a system architecture diagram of an aircraft attendant cue-play system according to an embodiment of the present invention.
Fig. 4 is a block diagram of a multi-modal encoding module in an aircraft attendant hint broadcast system according to an embodiment of the present invention.
Fig. 5 is a block diagram of a feature fusion module in an aircraft attendant hint broadcasting system according to an embodiment of the present invention.
Fig. 6 is a flowchart of an aircraft attendant cue-casting method according to an embodiment of the present invention.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the invention.
Detailed Description
Hereinafter, exemplary embodiments according to the present invention will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present invention and not all embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein.
Summary of the application: as described in the background above, during crew broadcasting, aircraft fluctuation shifts the propagation of the sound signal, so some passengers may miss emphasized content, and the broadcast may even become muddled, ambiguous, or misaligned. That is, when the audio data provided by the aircraft attendant is played, aircraft vibration may deviate its propagation such that passengers cannot hear the emphasis and may even experience ambiguity and misalignment.
In order to solve this technical problem, in the technical scheme of the invention, before the audio data to be played is played, it is adaptively corrected based on the vibration condition of the aircraft so as to pre-compensate the audio signal before it propagates. In this way, the propagation offset of the audio signal caused by aircraft fluctuation is avoided, ensuring that the sound signal reaches the ears of the passengers with fidelity.
Specifically, audio data to be played for a predetermined period of time provided by an aircraft attendant is first acquired, and an aircraft vibration signal for the predetermined period of time. In order to extract the vibration characteristics of the aircraft in the preset time period more accurately, in the technical scheme of the invention, firstly, frequency domain characteristic analysis is carried out on the aircraft vibration signals to obtain a plurality of aircraft vibration frequency domain statistical characteristics, and then, the aircraft vibration signals and the plurality of aircraft vibration frequency domain statistical characteristics are input into a multi-mode joint encoder comprising an image encoder and a sequence encoder to obtain an aircraft vibration characteristic matrix. Here, it should be noted that the statistical features of the vibration frequency domains of the plurality of aircraft are discrete data, and the vibration signals of the aircraft are two-dimensional waveform diagrams, which belong to data of different modes, so in the technical scheme of the invention, the multi-mode joint encoder comprising the image encoder and the sequence encoder is used for extracting single-mode features of the vibration signals of the aircraft and the statistical features of the vibration frequency domains of the plurality of aircraft, and then carrying out mode feature fusion to obtain the vibration feature matrix of the aircraft.
And then, the audio data to be played in the preset time period passes through a convolutional neural network model serving as a feature extractor to obtain an audio waveform image feature matrix. That is, in the technical solution of the present invention, the audio data to be played is regarded as a two-dimensional waveform chart, and a convolutional neural network model with excellent performance in the field of image feature extraction is used as a feature extractor to capture high-dimensional image hidden features in the audio data to be played, that is, the audio waveform image feature matrix. After the aircraft vibration feature matrix and the audio waveform image feature matrix are obtained, the aircraft vibration feature matrix and the audio waveform image feature matrix are fused in a high-dimensional feature space to obtain a fused feature matrix containing aircraft vibration features and audio waveform features.
And then, the fusion feature matrix is passed through a generator based on a countermeasure generation network to obtain the corrected audio data to be played. That is, the corrected audio data to be played corresponding to the fusion feature matrix is fitted based on the idea of adversarial generation. As will be appreciated by those of ordinary skill in the art, the countermeasure generation network includes a discriminator and a generator. During training, the generator generates corrected audio data to be played, and the discriminator measures the difference between the generated data and the real corrected audio data to be played to obtain a discriminator loss function value; the neural network parameters of the generator are then updated with that value as the loss, through a back-propagation algorithm of gradient descent, so that the corrected audio data generated by the generator approaches the real corrected audio data. In this way, the quality of the corrected audio data to be played is improved.
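The generator-update loop described above can be sketched with a toy scalar generator and a frozen discriminator. This is a hedged illustration of the adversarial training idea only: the one-parameter models, the frozen discriminator, and the finite-difference gradient (standing in for back-propagation) are all simplifying assumptions.

```python
# Toy sketch of one side of adversarial training: the generator parameter is
# updated by gradient descent on the (non-saturating) generator loss, with the
# discriminator held fixed. All models and numbers are illustrative.
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def g(z, theta):                  # generator: maps a code z to an output
    return theta * z

def d(v, w):                      # discriminator: scores realism of v
    return sigmoid(w * v)

def generator_loss(theta, w, zs):
    # non-saturating generator loss: -mean log d(g(z))
    return -sum(math.log(d(g(z, theta), w) + 1e-9) for z in zs) / len(zs)

random.seed(0)
zs = [random.uniform(0.5, 1.5) for _ in range(32)]
theta, w, lr, eps = 0.1, 1.0, 0.5, 1e-4
loss_before = generator_loss(theta, w, zs)
for _ in range(100):
    # finite-difference gradient stands in for back-propagation
    grad = (generator_loss(theta + eps, w, zs)
            - generator_loss(theta - eps, w, zs)) / (2 * eps)
    theta -= lr * grad            # gradient-descent update of the generator
loss_after = generator_loss(theta, w, zs)
```

In a full adversarial loop the discriminator parameter would be updated in alternation with the generator; only the generator step is shown here.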
After the corrected audio data to be played is obtained, it is played. That is, before playing, the audio data to be played is adaptively corrected based on the vibration condition of the aircraft to pre-compensate the audio signal before it propagates; in this way, the propagation offset of the audio signal caused by aircraft fluctuation is avoided, ensuring that the sound signal reaches the ears of the passengers with fidelity.
Here, when the aircraft vibration feature matrix and the audio waveform image feature matrix are fused, for example by position-wise addition, to obtain the fused feature matrix, it should be noted that the two matrices express vibration frequency-domain correlation features and audio waveform semantic features, respectively. If the correlation between their overall feature distributions and the consistency of their probability density distributions can be improved, the fusion effect, and thus the feature expression effect of the fused feature matrix, will improve, which in turn improves the quality of the corrected audio data to be played generated from the fused feature matrix.
The applicant of the present invention therefore first expands the aircraft vibration feature matrix and the audio waveform image feature matrix into an aircraft vibration feature vector, denoted $V_a$, and an audio waveform image feature vector, denoted $V_b$, both of length $L$, and calculates the correlation-probability density distribution affine mapping factors between the aircraft vibration feature vector $V_a$ and the audio waveform image feature vector $V_b$ as:
\[ w_1 = \frac{1}{L}\sum_{i=1}^{L}\left[\exp\left(M \otimes \Sigma^{-1}\right) \otimes \left(V_a - \mu\right)\right]_i \]
\[ w_2 = \frac{1}{L}\sum_{i=1}^{L}\left[\exp\left(M^{\top} \otimes \Sigma^{-1}\right) \otimes \left(V_b - \mu\right)\right]_i \]
wherein $M$ is the correlation matrix obtained by position-by-position correlation between the aircraft vibration feature vector $V_a$ and the audio waveform image feature vector $V_b$, and $\mu$ and $\Sigma$ are the mean vector and covariance matrix of the Gaussian density map formed by the aircraft vibration feature vector $V_a$ and the audio waveform image feature vector $V_b$.
That is, by constructing the correlated feature space of the aircraft vibration feature vector $V_a$ and the audio waveform image feature vector $V_b$ together with the probability density space represented by the Gaussian probability density, $V_a$ and $V_b$ can be mapped into affine homography subspaces within the correlated feature space and the probability density space, respectively, so as to extract representations of the features that comply with affine homography within the correlation domain and the probability density domain. By using the distribution affine mapping factor values $w_1$ and $w_2$ to weight the aircraft vibration feature matrix and the audio waveform image feature matrix respectively, the correlation of their feature representations and the consistency of their probability density distributions can be improved, thereby improving the feature expression effect of the fused feature matrix obtained by fusion and, in turn, the quality of the corrected audio data to be played generated from the fused feature matrix.
Based on this, the invention proposes a prompt broadcasting system for aircraft crews, comprising: the system comprises a data acquisition module to be played, a data processing module and a data processing module, wherein the data acquisition module to be played is used for acquiring audio data to be played in a preset time period provided by an aircraft attendant and an aircraft vibration signal in the preset time period; the vibration characteristic extraction module is used for carrying out frequency domain characteristic analysis on the aircraft vibration signals so as to obtain a plurality of aircraft vibration frequency domain statistical characteristics; the multi-mode coding module is used for inputting the aircraft vibration signals and the plurality of aircraft vibration frequency domain statistical characteristics into a multi-mode joint encoder comprising an image encoder and a sequence encoder so as to obtain an aircraft vibration characteristic matrix; the audio waveform feature extraction module is used for enabling the audio data to be played in the preset time period to pass through a convolutional neural network model serving as a feature extractor so as to obtain an audio waveform image feature matrix; the feature fusion module is used for fusing the aircraft vibration feature matrix and the audio waveform image feature matrix to obtain a fused feature matrix; the countermeasure generation module is used for enabling the fusion feature matrix to pass through a generator based on a countermeasure generation network to obtain corrected audio data to be played; and the broadcasting module is used for broadcasting the corrected audio data to be played.
Fig. 1 is a schematic view of a scene of an aircraft attendant cue-play system according to an embodiment of the present invention. As shown in fig. 1, in this application scenario, the audio data to be played for a predetermined period of time provided by the aircraft attendant is acquired by an audio sensor (e.g., V1 as illustrated in fig. 1), and the aircraft vibration signal for the predetermined period of time is acquired by a vibration signal sensor (e.g., V2 as illustrated in fig. 1). The information is then input to a server (e.g., S in fig. 1) deployed with a prompt broadcast algorithm for the aircraft attendant, where the server can process the input information with the algorithm to generate the corrected audio data to be played.
Having described the basic principles of the present invention, various non-limiting embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
Exemplary System: fig. 2 is a block diagram of an aircraft attendant cue-casting system according to an embodiment of the present invention. As shown in fig. 2, an aircraft attendant prompt broadcasting system 300 according to an embodiment of the present invention includes: a data acquisition module 310 to be played; a vibration feature extraction module 320; a multi-mode encoding module 330; an audio waveform feature extraction module 340; a feature fusion module 350; a countermeasure generation module 360; and a broadcast module 370.
The data to be played collection module 310 is configured to obtain audio data to be played in a predetermined period of time provided by an aircraft attendant, and an aircraft vibration signal in the predetermined period of time; the vibration feature extraction module 320 is configured to perform frequency domain feature analysis on the aircraft vibration signal to obtain a plurality of aircraft vibration frequency domain statistical features; the multi-mode encoding module 330 is configured to input the aircraft vibration signal and the plurality of aircraft vibration frequency domain statistics into a multi-mode joint encoder including an image encoder and a sequence encoder to obtain an aircraft vibration feature matrix; the audio waveform feature extraction module 340 is configured to pass the audio data to be played in the predetermined period of time through a convolutional neural network model serving as a feature extractor to obtain an audio waveform image feature matrix; the feature fusion module 350 is configured to fuse the aircraft vibration feature matrix and the audio waveform image feature matrix to obtain a fused feature matrix; the countermeasure generation module 360 is configured to pass the fusion feature matrix through a generator based on a countermeasure generation network to obtain corrected audio data to be played; and the broadcasting module 370 is configured to play the corrected audio data to be played.
Fig. 3 is a system architecture diagram of an aircraft attendant cue-play system according to an embodiment of the present invention. As shown in fig. 3, in the network architecture, first, audio data to be played for a predetermined period of time provided by an aircraft attendant and an aircraft vibration signal for the predetermined period of time are acquired through the data to be played acquisition module 310; then, the vibration feature extraction module 320 performs frequency domain feature analysis on the aircraft vibration signal acquired by the data acquisition module 310 to be played to obtain a plurality of aircraft vibration frequency domain statistical features; the multi-mode encoding module 330 inputs the aircraft vibration signals acquired by the data acquisition module 310 to be played and the plurality of aircraft vibration frequency domain statistical features obtained by the vibration feature extraction module 320 into a multi-mode joint encoder comprising an image encoder and a sequence encoder to obtain an aircraft vibration feature matrix; then, the audio waveform feature extraction module 340 passes the audio data to be played in the predetermined period acquired by the data to be played acquisition module 310 through a convolutional neural network model as a feature extractor to obtain an audio waveform image feature matrix; the feature fusion module 350 fuses the aircraft vibration feature matrix obtained by the multi-mode encoding module 330 and the audio waveform image feature matrix obtained by the audio waveform feature extraction module 340 to obtain a fused feature matrix; the countermeasure generation module 360 passes the fused feature matrix obtained by the feature fusion module 350 through a generator based on a countermeasure generation network to obtain corrected audio data to be played; further, the broadcasting module 370 plays the corrected audio data to be played.
Specifically, during operation of the alert broadcast system 300 for an aircraft attendant, the data to be played collection module 310 is configured to obtain audio data to be played for a predetermined period of time provided by the aircraft attendant, and an aircraft vibration signal for the predetermined period of time. It should be understood that in the actual playing process, the actually played audio data may deviate due to unstable vibration of the aircraft, so that the hearing of the passengers is adversely affected. More specifically, first, audio data to be played for a predetermined period of time provided by an aircraft attendant is acquired by an audio sensor, and an aircraft vibration signal for the predetermined period of time is acquired by a vibration signal sensor.
Specifically, during operation of the alert broadcast system 300 for an aircraft attendant, the vibration feature extraction module 320 is configured to perform frequency domain feature analysis on the aircraft vibration signal to obtain a plurality of aircraft vibration frequency domain statistical features. In the technical scheme of the invention, in order to extract the vibration characteristics of the aircraft in the preset time period more accurately, the frequency domain characteristic analysis is carried out on the aircraft vibration signals so as to obtain a plurality of aircraft vibration frequency domain statistical characteristics. In one specific example of the invention, the conversion of the time domain to the frequency domain of the aircraft vibration signal may be achieved by fourier transformation.
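As a hedged illustration of this step, the following sketch Fourier-transforms a vibration signal and derives a few frequency-domain statistics. The patent does not enumerate which statistics are computed, so the spectral centroid, mean spectral amplitude, and spectral RMS below are assumptions:

```python
import numpy as np

def vibration_frequency_stats(signal, fs=1000.0):
    """Fourier-transform the vibration signal and derive a few
    frequency-domain statistics. The specific statistics are
    illustrative; the patent only states that several are computed."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = spectrum ** 2
    centroid = float(np.sum(freqs * power) / np.sum(power))  # spectral centroid
    mean_amp = float(spectrum.mean())                        # mean spectral amplitude
    rms = float(np.sqrt(np.mean(power)))                     # RMS of the spectrum
    return np.array([centroid, mean_amp, rms])

t = np.arange(0, 1.0, 1.0 / 1000.0)
sig = np.sin(2 * np.pi * 50 * t)   # 50 Hz stand-in vibration signal
stats = vibration_frequency_stats(sig)
```

For the pure 50 Hz test tone, the spectral centroid lands at 50 Hz, confirming the time-to-frequency conversion.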
Specifically, during operation of the aircraft attendant prompt broadcast system 300, the multi-modal encoding module 330 is configured to input the aircraft vibration signal and the plurality of aircraft vibration frequency domain statistical features into a multi-modal joint encoder comprising an image encoder and a sequence encoder to obtain an aircraft vibration feature matrix. Here, it should be noted that the plurality of aircraft vibration frequency domain statistical features are discrete data, whereas the aircraft vibration signal is a two-dimensional waveform diagram; the two belong to data of different modalities. Therefore, in the technical scheme of the invention, the multi-modal joint encoder comprising the image encoder and the sequence encoder is used to first extract single-modal features of the aircraft vibration signal and of the plurality of aircraft vibration frequency domain statistical features, and then perform modal feature fusion to obtain the aircraft vibration feature matrix. In a specific example of the present invention, the image encoder is a convolutional neural network model serving as a filter, and the sequence encoder is a multi-scale neighborhood feature extraction module, wherein the multi-scale neighborhood feature extraction module comprises: a first convolution layer, a second convolution layer parallel to the first convolution layer, and a multi-scale feature fusion layer connected with the first convolution layer and the second convolution layer, wherein the first convolution layer uses a one-dimensional convolution kernel with a first length, and the second convolution layer uses a one-dimensional convolution kernel with a second length.
Specifically, in one example, first, the plurality of aircraft vibration frequency domain statistical features are input to the sequence encoder to obtain an aircraft vibration frequency domain statistical feature vector; more specifically, inputting the aircraft vibration frequency domain statistical feature into a first convolution layer of the multi-scale neighborhood feature extraction module to obtain a first neighborhood scale aircraft vibration frequency domain statistical feature vector, wherein the first convolution layer has a first one-dimensional convolution kernel with a first length; inputting the aircraft vibration frequency domain statistical features into a second convolution layer of the multi-scale neighborhood feature extraction module to obtain a second neighborhood scale aircraft vibration frequency domain statistical feature vector, wherein the second convolution layer is provided with a second one-dimensional convolution kernel with a second length, and the first length is different from the second length; and cascading the first neighborhood scale aircraft vibration frequency domain statistical feature vector with the second neighborhood scale aircraft vibration frequency domain statistical feature vector to obtain the aircraft vibration frequency domain statistical feature vector. 
Inputting the plurality of aircraft vibration frequency domain statistical features into the first convolution layer of the multi-scale neighborhood feature extraction module to obtain a first neighborhood scale aircraft vibration frequency domain statistical feature vector includes: performing one-dimensional convolution encoding on the aircraft vibration frequency domain statistical features using the first convolution layer of the multi-scale neighborhood feature extraction module according to the following one-dimensional convolution formula to obtain the first neighborhood scale aircraft vibration frequency domain statistical feature vector; wherein the one-dimensional convolution formula is:

Cov1(X) = Σ_a F1(a) · G(x − a)

wherein a is the width of the first convolution kernel in the x direction, F1(a) is the first convolution kernel parameter vector, G(x − a) is the local vector matrix operated on with the convolution kernel function, w1 is the size of the first convolution kernel, X represents the aircraft vibration frequency domain statistical features, and Cov1(X) represents one-dimensional convolution encoding of the aircraft vibration frequency domain statistical features. Inputting the aircraft vibration frequency domain statistical features into the second convolution layer of the multi-scale neighborhood feature extraction module to obtain a second neighborhood scale aircraft vibration frequency domain statistical feature vector includes: performing one-dimensional convolution encoding on the aircraft vibration frequency domain statistical features using the second convolution layer of the multi-scale neighborhood feature extraction module according to the following one-dimensional convolution formula to obtain the second neighborhood scale aircraft vibration frequency domain statistical feature vector; wherein the one-dimensional convolution formula is:

Cov2(X) = Σ_a F2(a) · G(x − a)

wherein a is the width of the second convolution kernel in the x direction, F2(a) is the second convolution kernel parameter vector, G(x − a) is the local vector matrix operated on with the convolution kernel function, w2 is the size of the second convolution kernel, X represents the aircraft vibration frequency domain statistical features, and Cov2(X) represents one-dimensional convolution encoding of the aircraft vibration frequency domain statistical features.

Secondly, the aircraft vibration signal is input into the image encoder to obtain an aircraft vibration waveform feature vector. More specifically, each layer of the convolutional neural network model serving as the filter performs, in the forward pass of the layer, the following operations on the input data: performing convolution processing on the input data to obtain a convolution feature map; pooling the convolution feature map based on a feature matrix to obtain a pooled feature map; and performing nonlinear activation on the pooled feature map to obtain an activation feature map. The output of the last layer of the convolutional neural network serving as the filter is the aircraft vibration waveform feature vector, and the input of the first layer of the convolutional neural network serving as the filter is the aircraft vibration signal.
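The two parallel one-dimensional convolutions and the concatenation performed by the multi-scale feature fusion layer can be sketched as follows; the kernel lengths, kernel values, and input length are illustrative assumptions, not values from the patent:

```python
import numpy as np

def multi_scale_neighborhood_features(x, k1=3, k2=5, seed=0):
    """Sketch of the multi-scale neighborhood feature extraction module:
    two parallel 1-D convolutions with kernels of different lengths,
    followed by concatenation (the multi-scale feature fusion layer)."""
    rng = np.random.default_rng(seed)
    f1 = rng.standard_normal(k1)  # first convolution kernel (first length)
    f2 = rng.standard_normal(k2)  # second convolution kernel (second length)
    # Cov(X) = sum_a F(a) * G(x - a): 'valid' 1-D convolution
    v1 = np.convolve(x, f1, mode="valid")
    v2 = np.convolve(x, f2, mode="valid")
    return np.concatenate([v1, v2])  # cascade the two neighborhood scales

stats = np.arange(16, dtype=float)   # stand-in frequency-domain statistics
vec = multi_scale_neighborhood_features(stats)
```

With 16 input statistics and valid convolutions of lengths 3 and 5, the cascaded vector has 14 + 12 = 26 entries.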
Fig. 4 is a block diagram of a multi-modal encoding module in an aircraft attendant prompt broadcast system according to an embodiment of the present invention. As shown in fig. 4, the multi-modal encoding module 330 includes: a frequency domain feature extraction unit 331, configured to input the plurality of aircraft vibration frequency domain statistical features into the sequence encoder to obtain an aircraft vibration frequency domain statistical feature vector; a vibration waveform feature extraction unit 332, configured to input the aircraft vibration signal into the image encoder to obtain an aircraft vibration waveform feature vector; and a joint optimization unit 333, configured to calculate a vector product between the transposed vector of the aircraft vibration waveform feature vector and the aircraft vibration frequency domain statistical feature vector to obtain the aircraft vibration feature matrix.
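The vector product computed by the joint optimization unit 333 is, in matrix terms, an outer product of the two vectors. A minimal sketch with hypothetical feature dimensions (the patent does not specify vector lengths):

```python
import numpy as np

# Hypothetical dimensions; the patent does not specify feature sizes.
vib_waveform_vec = np.arange(4, dtype=float)   # from the image encoder
vib_freq_stat_vec = np.arange(3, dtype=float)  # from the sequence encoder

# Vector product of the transposed waveform vector with the statistics
# vector: an outer product yielding the vibration feature matrix.
vibration_feature_matrix = np.outer(vib_waveform_vec, vib_freq_stat_vec)
```

Each matrix entry pairs one waveform feature with one frequency-domain statistic, which is how the two modalities are jointly encoded.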
Specifically, during the operation of the alert broadcasting system 300 for an aircraft attendant, the audio waveform feature extraction module 340 is configured to pass the audio data to be played in the predetermined period of time through a convolutional neural network model serving as a feature extractor to obtain an audio waveform image feature matrix. That is, the audio data to be played is regarded as a two-dimensional waveform diagram, and a convolutional neural network with excellent performance in the field of image feature extraction is further used for extracting high-dimensional image hidden features in the audio data to be played in the preset time period to obtain an audio waveform image feature matrix. In a specific example, the convolutional neural network as the feature extractor includes a plurality of neural network layers cascaded with each other, wherein each neural network layer includes a convolutional layer, a pooling layer, and an activation layer. In the encoding process of the convolutional neural network serving as the feature extractor, each layer of the convolutional neural network serving as the feature extractor performs convolution processing based on convolution kernel on input data by using the convolutional layer in the forward transfer process of the layer, performs pooling processing along the channel dimension on the convolutional feature map output by the convolutional layer by using the pooling layer, and performs activation processing on the pooled feature map output by the pooling layer by using the activation layer. More specifically, the output of the last layer of the convolutional neural network as the feature extractor is the audio waveform image feature matrix, and the input of the first layer of the convolutional neural network as the feature extractor is the audio data to be played for the predetermined period of time.
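One layer of the described feature extractor (kernel-based convolution, pooling along the channel dimension, nonlinear activation) can be sketched as follows; the 3x3 kernels, input size, and mean pooling are illustrative assumptions:

```python
import numpy as np

def conv_pool_activate(img, kernels):
    """One layer of the CNN feature extractor sketched from the text:
    kernel-based convolution, pooling along the channel dimension,
    and ReLU activation. Kernel values are illustrative."""
    h, w = img.shape[0] - 2, img.shape[1] - 2
    feature_maps = np.empty((len(kernels), h, w))
    for c, k in enumerate(kernels):        # 3x3 'valid' convolutions
        for i in range(h):
            for j in range(w):
                feature_maps[c, i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    pooled = feature_maps.mean(axis=0)     # pooling along the channel dimension
    return np.maximum(pooled, 0.0)         # nonlinear activation (ReLU)

waveform_img = np.ones((6, 6))             # stand-in audio waveform image
kernels = [np.full((3, 3), 1.0), np.full((3, 3), -1.0)]
out = conv_pool_activate(waveform_img, kernels)
```

Cascading several such layers, with the last layer's output flattened, would yield the audio waveform image feature matrix described above.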
Specifically, during the operation of the prompt broadcast system 300 for an aircraft attendant, the feature fusion module 350 is configured to fuse the aircraft vibration feature matrix and the audio waveform image feature matrix to obtain a fused feature matrix. That is, after the aircraft vibration feature matrix and the audio waveform image feature matrix are obtained, the two are fused in a high-dimensional feature space to obtain a fused feature matrix containing both aircraft vibration features and audio waveform features. Here, when the aircraft vibration feature matrix and the audio waveform image feature matrix are fused, for example by point-wise addition, since the two matrices express vibration frequency domain association features and audio waveform semantic features respectively, improving the correlation between their overall feature distributions and the consistency of their probability density distributions improves the fusion effect, and thereby the feature expression effect of the fused feature matrix and the quality of the corrected audio data to be played that it generates. The applicant therefore first expands the aircraft vibration feature matrix and the audio waveform image feature matrix into an aircraft vibration feature vector, denoted V1, and an audio waveform image feature vector, denoted V2, and calculates the association-probability density distribution affine mapping factors of V1 and V2 (the closed-form expressions are rendered only as formula images in the original publication), wherein V1 represents the aircraft vibration feature vector, V2 represents the audio waveform image feature vector, A is the correlation matrix obtained by position-wise association between V1 and V2, μ and Σ are the mean vector and covariance matrix of the Gaussian density map formed by V1 and V2, ⊗ represents matrix multiplication, exp(·) represents the exponential operation of a matrix, that is, computing the natural exponential function value raised to the power of the feature value at each position in the matrix, α1 represents the first association-probability density distribution affine mapping factor, and α2 represents the second association-probability density distribution affine mapping factor. That is, by constructing the associated feature space of V1 and V2 and the probability density space represented by the Gaussian probability density, V1 and V2 can be mapped into affine homography subspaces within the associated feature space and the probability density space, respectively, so as to extract feature representations conforming to affine homography within the associated feature domain and the probability density domain. By weighting the aircraft vibration feature matrix and the audio waveform image feature matrix with the affine mapping factor values α1 and α2, respectively, the relevance of their feature representations and the consistency of their probability density distributions can be improved, thereby improving the feature expression effect of the fused feature matrix obtained by fusion, and in turn the quality of the corrected audio data to be played that it generates.
Fig. 5 is a block diagram of a feature fusion module in an aircraft attendant hint broadcasting system according to an embodiment of the present invention. As shown in fig. 5, the feature fusion module 350 includes: a feature matrix expansion unit 351 configured to expand the aircraft vibration feature matrix and the audio waveform image feature matrix into an aircraft vibration feature vector and an audio waveform image feature vector; an affine mapping factor calculating unit 352 for calculating a correlation-probability density distribution affine mapping factor between the aircraft vibration feature vector and the audio waveform image feature vector to obtain a first correlation-probability density distribution affine mapping factor and a second correlation-probability density distribution affine mapping factor; a fusion unit 353 for calculating a position weighted sum between the aircraft vibration feature matrix and the audio waveform image feature matrix with the first correlation-probability density distribution affine mapping factor and the second correlation-probability density distribution affine mapping factor as weights to obtain the fusion feature matrix.
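The position-wise weighted fusion performed by the fusion unit 353 can be sketched as follows. The true association-probability density distribution affine mapping factors are given only as formula images in the original publication, so softmax-normalized scalar scores stand in for them here; this is an assumption, not the patented formula:

```python
import numpy as np

def fuse(m1, m2):
    """Position-wise weighted sum of the two feature matrices. Softmax-
    normalized scalar scores stand in for the affine mapping factors,
    whose exact closed form appears only as images in the source."""
    v1, v2 = m1.ravel(), m2.ravel()       # expand matrices into vectors
    s1, s2 = v1 @ v1, v2 @ v2             # stand-in scalar scores
    m = max(s1, s2)
    e1, e2 = np.exp(s1 - m), np.exp(s2 - m)
    w1, w2 = e1 / (e1 + e2), e2 / (e1 + e2)
    return w1 * m1 + w2 * m2              # position-wise weighted sum

vib = np.ones((2, 2))      # stand-in aircraft vibration feature matrix
audio = np.zeros((2, 2))   # stand-in audio waveform image feature matrix
fused = fuse(vib, audio)
```

The fused matrix keeps the shape of its inputs, with each position a convex combination of the two feature matrices.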
Specifically, during operation of the aircraft attendant prompt broadcast system 300, the countermeasure generation module 360 is configured to pass the fused feature matrix through a generator based on a countermeasure generation network to obtain corrected audio data to be played. That is, the corrected audio data to be played corresponding to the fused feature matrix is fitted based on the countermeasure generation idea. As will be appreciated by those of ordinary skill in the art, the countermeasure generation network includes a discriminator and a generator. During training, the generator generates corrected audio data to be played, and the discriminator measures the difference between the generated corrected audio data and the actual corrected audio data to obtain a discriminator loss function value. The neural network parameters of the generator are then updated, with this value as the loss function, through a back propagation algorithm based on gradient descent, so that the corrected audio data generated by the generator approximates the actual corrected audio data; in this way, the quality of the corrected audio data to be played is improved.
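A minimal sketch of the adversarial training loop described above, with a linear generator and a logistic discriminator standing in for the unspecified network architectures; all dimensions, data, and hyperparameters are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (dimensions are illustrative, not from the patent):
fused = rng.standard_normal((8, 16))     # batch of fused feature vectors
real = rng.standard_normal((8, 32))      # "actual" corrected audio targets

G = rng.standard_normal((16, 32)) * 0.1  # generator: linear map sketch
D = rng.standard_normal((32, 1)) * 0.1   # discriminator: logistic sketch

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(50):
    fake = fused @ G
    # Discriminator step: minimize -[log D(real) + log(1 - D(fake))]
    p_real, p_fake = sigmoid(real @ D), sigmoid(fake @ D)
    grad_D = real.T @ (p_real - 1.0) + fake.T @ p_fake
    D -= 0.01 * grad_D / len(real)
    # Generator step (non-saturating): minimize -log D(G(fused))
    p_fake = sigmoid(fused @ G @ D)
    grad_G = fused.T @ ((p_fake - 1.0) @ D.T)
    G -= 0.01 * grad_G / len(fused)
```

The two alternating gradient steps mirror the described training: the discriminator loss drives both updates, pushing the generator's output toward the real corrected audio.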
Specifically, during operation of the aircraft attendant prompt broadcasting system 300, the broadcasting module 370 is configured to play the corrected audio data to be played. That is, in the technical scheme of the invention, before the audio data to be played is played, the audio data to be played is adaptively corrected based on the vibration condition of the aircraft so as to pre-compensate the audio signal before the audio signal is transmitted, and in this way, the transmission offset of the audio signal caused by the fluctuation of the aircraft is avoided so as to ensure that the sound signal can be transmitted to the ears of the passengers in a fidelity manner.
In summary, the aircraft attendant prompt broadcast system 300 according to the embodiment of the present invention has been illustrated. Using a deep-learning-based neural network model, it fuses the vibration frequency domain association features and the audio waveform semantic features during broadcasting, and adaptively corrects the audio data to be played based on the vibration condition of the aircraft, thereby pre-compensating the audio signal before it is propagated. In this way, propagation offset of the audio signal caused by aircraft fluctuation is avoided, ensuring that the audio signal is transmitted to the ears of the passengers with fidelity.
As described above, the flight attendant cue broadcasting system according to the embodiment of the present invention can be implemented in various terminal devices. In one example, the aircraft attendant reminder broadcast system 300 according to embodiments of the present invention may be integrated into the terminal device as a software module and/or hardware module. For example, the aircraft attendant reminder broadcast system 300 may be a software module in the operating system of the terminal device or may be an application developed for the terminal device; of course, the aircraft attendant prompt broadcast system 300 could equally be one of the plurality of hardware modules of the terminal device.
Alternatively, in another example, the aircraft attendant prompt broadcast system 300 and the terminal device may be separate devices, and the aircraft attendant prompt broadcast system 300 may be connected to the terminal device via a wired and/or wireless network and communicate interactive information in an agreed data format.
An exemplary method is: fig. 6 is a flowchart of an aircraft attendant cue-casting method according to an embodiment of the present invention. As shown in fig. 6, the method for broadcasting a prompt for an aircraft attendant according to an embodiment of the present invention includes the steps of: s110, acquiring audio data to be played in a preset time period provided by an aircraft attendant, and an aircraft vibration signal in the preset time period; s120, performing frequency domain feature analysis on the aircraft vibration signals to obtain a plurality of aircraft vibration frequency domain statistical features; s130, inputting the aircraft vibration signals and the plurality of aircraft vibration frequency domain statistical characteristics into a multi-mode joint encoder comprising an image encoder and a sequence encoder to obtain an aircraft vibration characteristic matrix; s140, passing the audio data to be played in the preset time period through a convolutional neural network model serving as a feature extractor to obtain an audio waveform image feature matrix; s150, fusing the aircraft vibration feature matrix and the audio waveform image feature matrix to obtain a fused feature matrix; s160, the fusion feature matrix passes through a generator based on a countermeasure generation network to obtain corrected audio data to be played; and S170, playing the corrected audio data to be played.
In one example, in the above-mentioned prompt broadcasting method for an aircraft attendant, the step S130 includes: inputting the plurality of aircraft vibration frequency domain statistical features into the sequence encoder to obtain an aircraft vibration frequency domain statistical feature vector; inputting the aircraft vibration signal into the image encoder to obtain an aircraft vibration waveform feature vector; and calculating a vector product between the transposed vector of the aircraft vibration waveform feature vector and the aircraft vibration frequency domain statistical feature vector to obtain the aircraft vibration feature matrix. The image encoder is a convolutional neural network model serving as a filter, and the sequence encoder is a multi-scale neighborhood feature extraction module, wherein the multi-scale neighborhood feature extraction module comprises: a first convolution layer, a second convolution layer parallel to the first convolution layer, and a multi-scale feature fusion layer connected with the first convolution layer and the second convolution layer, wherein the first convolution layer uses a one-dimensional convolution kernel with a first length, and the second convolution layer uses a one-dimensional convolution kernel with a second length.
In one example, in the above-mentioned alert broadcasting method for an aircraft attendant, the step S140 includes: each layer of the convolutional neural network model using the feature extractor performs, in forward transfer of the layer, input data: carrying out convolution processing on input data to obtain a convolution characteristic diagram; pooling the convolution feature map along a channel dimension to obtain a pooled feature map; performing nonlinear activation on the pooled feature map to obtain an activated feature map; the output of the last layer of the convolutional neural network serving as the feature extractor is the audio waveform image feature matrix, and the input of the first layer of the convolutional neural network serving as the feature extractor is the audio data to be played in the preset time period.
In one example, in the above-mentioned prompt broadcasting method for an aircraft attendant, the step S150 includes: expanding the aircraft vibration feature matrix and the audio waveform image feature matrix into an aircraft vibration feature vector and an audio waveform image feature vector; calculating association-probability density distribution affine mapping factors between the aircraft vibration feature vector and the audio waveform image feature vector to obtain a first association-probability density distribution affine mapping factor and a second association-probability density distribution affine mapping factor; and calculating a position-wise weighted sum of the aircraft vibration feature matrix and the audio waveform image feature matrix, with the first and second association-probability density distribution affine mapping factors as weights, to obtain the fused feature matrix. The affine mapping factors are calculated with an optimization formula whose closed form is rendered only as formula images in the original publication, wherein V1 represents the aircraft vibration feature vector, V2 represents the audio waveform image feature vector, A is the correlation matrix obtained by position-wise association between V1 and V2, μ and Σ are the mean vector and covariance matrix of the Gaussian density map formed by V1 and V2, ⊗ represents matrix multiplication, exp(·) represents the exponential operation of a matrix, that is, computing the natural exponential function value raised to the power of the feature value at each position in the matrix, α1 represents the first association-probability density distribution affine mapping factor, and α2 represents the second association-probability density distribution affine mapping factor.
In one example, in the above-mentioned prompt broadcasting method for an aircraft attendant, in the step S160, the countermeasure generation network includes a generator and a discriminator.
In summary, the prompt broadcasting method for an aircraft attendant according to the embodiment of the present invention has been explained. Using a deep-learning-based neural network model, it fuses the vibration frequency domain association features and the audio waveform semantic features during broadcasting, and adaptively corrects the audio data to be played based on the vibration condition of the aircraft, thereby pre-compensating the audio signal before it is propagated. In this way, propagation offset of the audio signal caused by aircraft fluctuation is avoided, ensuring that the audio signal is transmitted to the ears of the passengers with fidelity.
Exemplary electronic device: next, an electronic device according to an embodiment of the present invention is described with reference to fig. 7.
Fig. 7 illustrates a block diagram of an electronic device according to an embodiment of the invention.
As shown in fig. 7, the electronic device 10 includes one or more processors 11 and a memory 12.
The processor 11 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 10 to perform desired functions.
Memory 12 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disks, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and may be executed by the processor 11 to implement the functions of the aircraft attendant prompt broadcast system of the various embodiments of the present invention described above and/or other desired functions. Various content, such as aircraft vibration frequency domain statistical features, may also be stored in the computer-readable storage medium.
In one example, the electronic device 10 may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown).
The input means 13 may comprise, for example, a keyboard, a mouse, etc.
The output device 14 may output various information to the outside, including audio data to be played after correction, and the like. The output means 14 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, etc.
Of course, only some of the components of the electronic device 10 that are relevant to the present invention are shown in fig. 7 for simplicity, components such as buses, input/output interfaces, etc. are omitted. In addition, the electronic device 10 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer readable storage Medium: in addition to the methods and apparatus described above, embodiments of the invention may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform steps in the functions of the aircraft attendant prompt broadcast method according to various embodiments of the invention described in the "exemplary systems" section of this specification.
The computer program product may write program code for performing operations of embodiments of the present invention in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present invention may also be a computer-readable storage medium, having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform steps in the functions of the aircraft crew alerting broadcasting method according to various embodiments of the present invention described in the above-mentioned "exemplary systems" section of the present specification.
The computer readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present invention have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present invention are merely examples and not intended to be limiting, and these advantages, benefits, effects, etc. are not to be considered as essential to the various embodiments of the present invention. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the invention is not necessarily limited to practice with the above described specific details.
The block diagrams of the devices, apparatuses, and systems referred to in the present invention are only illustrative examples and are not intended to require or imply that connections, arrangements, or configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, these devices, apparatuses, and systems may be connected, arranged, or configured in any manner. Words such as "including," "comprising," and "having" are open-ended and mean "including but not limited to," and are used interchangeably therewith. The term "or" as used herein refers to, and is used interchangeably with, the term "and/or," unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to."
It is also noted that in the apparatuses, devices, and methods of the present invention, the components or steps may be decomposed and/or recombined. Such decompositions and/or recombinations should be considered equivalent aspects of the present invention.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the invention to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims (10)

1. An aircraft attendant cue broadcasting system, comprising: a to-be-played data acquisition module, configured to acquire audio data to be played within a predetermined time period provided by an aircraft attendant, and an aircraft vibration signal within the predetermined time period; a vibration feature extraction module, configured to perform frequency domain feature analysis on the aircraft vibration signal to obtain a plurality of aircraft vibration frequency domain statistical features; a multi-modal encoding module, configured to input the aircraft vibration signal and the plurality of aircraft vibration frequency domain statistical features into a multi-modal joint encoder comprising an image encoder and a sequence encoder to obtain an aircraft vibration feature matrix; an audio waveform feature extraction module, configured to pass the audio data to be played within the predetermined time period through a convolutional neural network model serving as a feature extractor to obtain an audio waveform image feature matrix; a feature fusion module, configured to fuse the aircraft vibration feature matrix and the audio waveform image feature matrix to obtain a fused feature matrix; a countermeasure generation module, configured to pass the fused feature matrix through a generator based on a countermeasure generation network to obtain corrected audio data to be played; and a broadcasting module, configured to broadcast the corrected audio data to be played.
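The claim leaves the "frequency domain statistical features" unspecified; as a minimal sketch, the statistics below (mean spectral magnitude, spectral variance, spectral centroid, dominant frequency) and the sampling rate are assumptions, not the patent's defined feature set:

```python
import numpy as np

def vibration_frequency_stats(signal, fs=1000.0):
    """Illustrative frequency-domain statistics for a vibration signal.
    The specific statistics chosen here are assumptions."""
    spectrum = np.abs(np.fft.rfft(signal))            # magnitude spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)  # bin frequencies (Hz)
    power = spectrum ** 2
    centroid = np.sum(freqs * power) / np.sum(power)  # spectral centroid
    return np.array([
        spectrum.mean(),         # mean spectral magnitude
        spectrum.var(),          # spectral variance
        centroid,                # spectral centroid (Hz)
        freqs[np.argmax(power)]  # dominant frequency (Hz)
    ])

# Example: a 50 Hz sinusoid sampled at 1 kHz
t = np.arange(0, 1.0, 1e-3)
feats = vibration_frequency_stats(np.sin(2 * np.pi * 50 * t), fs=1000.0)
```

For the pure sinusoid above, the dominant-frequency statistic recovers the 50 Hz tone, since the FFT bin resolution at this length and rate is 1 Hz.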
2. The aircraft attendant cue broadcasting system of claim 1, wherein the image encoder is a convolutional neural network model serving as a filter, and the sequence encoder is a multi-scale neighborhood feature extraction module, wherein the multi-scale neighborhood feature extraction module comprises: a first convolution layer, a second convolution layer parallel to the first convolution layer, and a multi-scale feature fusion layer connected with the first convolution layer and the second convolution layer, wherein the first convolution layer uses a one-dimensional convolution kernel having a first length and the second convolution layer uses a one-dimensional convolution kernel having a second length.
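The parallel two-branch structure of claim 2 can be sketched as follows. The kernel lengths 3 and 5, the averaging kernels, and concatenation as the fusion operation are all illustrative assumptions; the claim fixes neither the lengths nor any learned weights:

```python
import numpy as np

def multi_scale_features(seq, k1=3, k2=5):
    """Sketch of multi-scale neighborhood feature extraction: two
    parallel one-dimensional convolutions with different kernel
    lengths, fused here by concatenation (an assumption)."""
    kern1 = np.ones(k1) / k1                   # first conv layer, length k1
    kern2 = np.ones(k2) / k2                   # parallel conv layer, length k2
    branch1 = np.convolve(seq, kern1, mode='same')
    branch2 = np.convolve(seq, kern2, mode='same')
    return np.concatenate([branch1, branch2])  # multi-scale fusion layer

feats = multi_scale_features(np.arange(10, dtype=float))
```

Each branch preserves the sequence length (`mode='same'`), so different neighborhood scales stay position-aligned before fusion.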
3. The aircraft attendant cue broadcasting system of claim 2, wherein the multi-modal encoding module comprises: a frequency domain feature extraction unit, configured to input the plurality of aircraft vibration frequency domain statistical features into the sequence encoder to obtain an aircraft vibration frequency domain statistical feature vector; a vibration waveform feature extraction unit, configured to input the aircraft vibration signal into the image encoder to obtain an aircraft vibration waveform feature vector; and a joint optimization unit, configured to calculate a vector product between the transposed vector of the aircraft vibration waveform feature vector and the aircraft vibration frequency domain statistical feature vector to obtain the aircraft vibration feature matrix.
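For two one-dimensional feature vectors, the claimed vector product between a transposed vector and a second vector is an outer product, which yields a matrix. The vector dimensions (4 and 6) below are arbitrary placeholders:

```python
import numpy as np

# Outer product of the vibration waveform feature vector and the
# frequency-domain statistical feature vector gives the joint matrix;
# the dimensions here are illustrative only.
waveform_vec = np.random.default_rng(0).standard_normal(4)
freq_stat_vec = np.random.default_rng(1).standard_normal(6)

vibration_matrix = np.outer(waveform_vec, freq_stat_vec)  # shape (4, 6)
```

Every entry (i, j) of the result pairs waveform feature i with frequency-domain feature j, so the matrix encodes all cross-modal interactions of the two vectors.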
4. The aircraft attendant cue broadcasting system of claim 3, wherein the audio waveform feature extraction module is configured to: perform, at each layer of the convolutional neural network model serving as the feature extractor, in a forward pass of the layer, the following operations on input data: performing convolution processing on the input data to obtain a convolution feature map; pooling the convolution feature map along a channel dimension to obtain a pooled feature map; and performing nonlinear activation on the pooled feature map to obtain an activation feature map; wherein the output of the last layer of the convolutional neural network serving as the feature extractor is the audio waveform image feature matrix, and the input of the first layer of the convolutional neural network serving as the feature extractor is the audio data to be played within the predetermined time period.
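The per-layer ordering of claim 4 (convolution, channel-dimension pooling, nonlinear activation) can be sketched with a single fixed kernel. The 3x3 averaging kernel, mean pooling, and ReLU are assumptions; a real layer would use learned multi-channel kernels:

```python
import numpy as np

def cnn_layer(x, kernel):
    """One forward pass of the claimed layer ordering: convolution,
    pooling along the channel dimension, then nonlinear activation."""
    c, h, w = x.shape
    kh, kw = kernel.shape
    out = np.zeros((c, h - kh + 1, w - kw + 1))
    for ch in range(c):                       # convolution, per channel
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[ch, i, j] = np.sum(x[ch, i:i+kh, j:j+kw] * kernel)
    pooled = out.mean(axis=0)                 # pool along channel dimension
    return np.maximum(pooled, 0.0)            # ReLU nonlinear activation

fmap = cnn_layer(np.random.default_rng(0).standard_normal((3, 8, 8)),
                 np.ones((3, 3)) / 9.0)
```

Note that pooling here collapses the channel axis rather than the spatial axes, matching the claim's "pooling along a channel dimension" rather than conventional spatial pooling.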
5. The aircraft attendant cue broadcasting system of claim 4, wherein the feature fusion module comprises: a feature matrix unfolding unit, configured to unfold the aircraft vibration feature matrix and the audio waveform image feature matrix into an aircraft vibration feature vector and an audio waveform image feature vector; an affine mapping factor calculation unit, configured to calculate correlation-probability density distribution affine mapping factors between the aircraft vibration feature vector and the audio waveform image feature vector to obtain a first correlation-probability density distribution affine mapping factor and a second correlation-probability density distribution affine mapping factor; and a fusion unit, configured to calculate a position-wise weighted sum of the aircraft vibration feature matrix and the audio waveform image feature matrix, using the first correlation-probability density distribution affine mapping factor and the second correlation-probability density distribution affine mapping factor as weights, to obtain the fused feature matrix.
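The fusion step of claim 5 reduces to a position-wise weighted sum once the two factors are known. The scalar weights 0.6 and 0.4 below are placeholders; the claimed factors come from the correlation-probability density distribution computation of claim 6:

```python
import numpy as np

# Position-wise weighted sum of the two feature matrices, with
# placeholder scalar weights standing in for the claimed factors.
M_vib = np.ones((4, 6))          # aircraft vibration feature matrix
M_audio = 2.0 * np.ones((4, 6))  # audio waveform image feature matrix
w1, w2 = 0.6, 0.4                # placeholder affine mapping factors

fused = w1 * M_vib + w2 * M_audio  # fused feature matrix
```

With these constants every fused entry equals 0.6 * 1 + 0.4 * 2 = 1.4, illustrating that the fusion is a convex-style blend of the two modalities when the weights sum to one.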
6. The aircraft attendant cue broadcasting system of claim 5, wherein the affine mapping factor calculation unit is configured to: calculate the correlation-probability density distribution affine mapping factors of the aircraft vibration feature vector and the audio waveform image feature vector according to an optimization formula to obtain a first correlation-probability density distribution affine mapping factor and a second correlation-probability density distribution affine mapping factor; wherein the optimization formula is presented as images in the original publication and is not reproduced here; it is defined over the aircraft vibration feature vector and the audio waveform image feature vector; the correlation matrix obtained by position-wise correlation between the aircraft vibration feature vector and the audio waveform image feature vector; the mean vector and covariance matrix of the Gaussian density map formed by the aircraft vibration feature vector and the audio waveform image feature vector; a matrix multiplication; and an exponential operation of a matrix, computed by raising the natural exponential function to the feature value at each position of the matrix; and it yields the first correlation-probability density distribution affine mapping factor and the second correlation-probability density distribution affine mapping factor.
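Since the published formula survives only as image placeholders, the following is a purely hypothetical reconstruction from the symbol descriptions alone: it forms the position-wise correlation matrix as an outer product, scores each vector under a Gaussian fitted to the stacked pair (mean and spread standing in for the claimed mean vector and covariance matrix), and normalizes the two scores into factors. None of this is the patent's actual formula:

```python
import numpy as np

def affine_mapping_factors(v1, v2):
    """Hypothetical sketch of the two correlation-probability density
    distribution affine mapping factors; the real formula is only
    available as images, so every step here is an assumption."""
    A = np.outer(v1, v2)                      # position-wise correlation matrix
    stacked = np.stack([v1, v2])
    mu = stacked.mean()                       # stand-in for the mean vector
    sigma = stacked.std() + 1e-8              # stand-in for the covariance
    s1 = np.exp(-((v1 - mu) ** 2) / (2 * sigma ** 2)).mean()
    s2 = np.exp(-((v2 - mu) ** 2) / (2 * sigma ** 2)).mean()
    w1, w2 = s1 / (s1 + s2), s2 / (s1 + s2)   # normalized factors
    return w1, w2, A

w1, w2, A = affine_mapping_factors(np.array([1.0, 2.0, 3.0]),
                                   np.array([2.0, 2.0, 2.0]))
```

Whatever the true formula, the two factors serve as the weights of the position-wise weighted sum in claim 5, so normalizing them to sum to one is a natural (but assumed) design choice.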
7. The aircraft attendant cue broadcasting system of claim 6, wherein the countermeasure generation network comprises a generator and a discriminator.
8. An aircraft attendant cue broadcasting method, comprising: acquiring audio data to be played within a predetermined time period provided by an aircraft attendant, and an aircraft vibration signal within the predetermined time period; performing frequency domain feature analysis on the aircraft vibration signal to obtain a plurality of aircraft vibration frequency domain statistical features; inputting the aircraft vibration signal and the plurality of aircraft vibration frequency domain statistical features into a multi-modal joint encoder comprising an image encoder and a sequence encoder to obtain an aircraft vibration feature matrix; passing the audio data to be played within the predetermined time period through a convolutional neural network model serving as a feature extractor to obtain an audio waveform image feature matrix; fusing the aircraft vibration feature matrix and the audio waveform image feature matrix to obtain a fused feature matrix; passing the fused feature matrix through a generator based on a countermeasure generation network to obtain corrected audio data to be played; and playing the corrected audio data to be played.
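The method steps of claim 8 can be sketched end to end with stand-in components. Every helper below is a stub: the real system uses a multi-modal joint encoder, a CNN feature extractor, and a countermeasure generation network, none of which are specified in enough detail here to implement faithfully, and the equal fusion weights are placeholders:

```python
import numpy as np

def frequency_domain_features(vib):          # step 2: frequency-domain stats
    spec = np.abs(np.fft.rfft(vib))
    return np.array([spec.mean(), spec.var(), spec.max()])

def joint_encode(vib, freq_feats):           # stand-in for the joint encoder
    return np.outer(vib[:4], freq_feats)

def audio_feature_matrix(audio):             # stand-in for the CNN extractor
    return audio[:12].reshape(4, 3)

def generator(fused):                        # stand-in for the GAN generator
    return fused.flatten()

vib = np.sin(np.linspace(0, 10, 64))                  # aircraft vibration signal
audio = np.random.default_rng(0).standard_normal(64)  # audio data to be played

freq_feats = frequency_domain_features(vib)
vib_matrix = joint_encode(vib, freq_feats)            # aircraft vibration matrix
audio_matrix = audio_feature_matrix(audio)            # audio waveform matrix
fused = 0.5 * vib_matrix + 0.5 * audio_matrix         # feature fusion
corrected_audio = generator(fused)                    # corrected audio to play
```

The sketch only demonstrates how the data flows between the claimed steps; it performs no learning and no correction of the audio.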
9. The method of claim 8, wherein fusing the aircraft vibration feature matrix and the audio waveform image feature matrix to obtain a fused feature matrix comprises: unfolding the aircraft vibration feature matrix and the audio waveform image feature matrix into an aircraft vibration feature vector and an audio waveform image feature vector; calculating correlation-probability density distribution affine mapping factors between the aircraft vibration feature vector and the audio waveform image feature vector to obtain a first correlation-probability density distribution affine mapping factor and a second correlation-probability density distribution affine mapping factor; and calculating a position-wise weighted sum of the aircraft vibration feature matrix and the audio waveform image feature matrix, using the first correlation-probability density distribution affine mapping factor and the second correlation-probability density distribution affine mapping factor as weights, to obtain the fused feature matrix.
10. The aircraft attendant cue broadcasting method of claim 9, wherein the countermeasure generation network comprises a generator and a discriminator.
CN202310308003.0A 2023-03-28 2023-03-28 Prompt broadcasting system for aircraft crews and method thereof Withdrawn CN116030830A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310308003.0A CN116030830A (en) 2023-03-28 2023-03-28 Prompt broadcasting system for aircraft crews and method thereof


Publications (1)

Publication Number Publication Date
CN116030830A true CN116030830A (en) 2023-04-28

Family

ID=86077903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310308003.0A Withdrawn CN116030830A (en) 2023-03-28 2023-03-28 Prompt broadcasting system for aircraft crews and method thereof

Country Status (1)

Country Link
CN (1) CN116030830A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117116289A (en) * 2023-10-24 2023-11-24 Jilin University Medical intercom management system for ward and method thereof
CN117116289B (en) * 2023-10-24 2023-12-26 Jilin University Medical intercom management system for ward and method thereof

Similar Documents

Publication Publication Date Title
JP6355800B1 (en) Learning device, generating device, learning method, generating method, learning program, and generating program
KR102195627B1 (en) Apparatus and method for generating translation model, apparatus and method for automatic translation
CN108510982B (en) Audio event detection method and device and computer readable storage medium
CN109981787B (en) Method and device for displaying information
CN111309883A (en) Man-machine conversation method based on artificial intelligence, model training method and device
CN110795912B (en) Method, device, equipment and storage medium for encoding text based on neural network
Laraba et al. Dance performance evaluation using hidden Markov models
CN110072140B (en) Video information prompting method, device, equipment and storage medium
CN116030830A (en) Prompt broadcasting system for aircraft crews and method thereof
US20200074180A1 (en) Accurate correction of errors in text data based on learning via a neural network
CN112364144B (en) Interaction method, device, equipment and computer readable medium
CN114330236A (en) Character generation method and device, electronic equipment and storage medium
US20150235643A1 (en) Interactive server and method for controlling the server
CN115967833A (en) Video generation method, device, equipment and storage medium
CN114913590B (en) Data emotion recognition method, device and equipment and readable storage medium
KR20200095947A (en) Electronic device and Method for controlling the electronic device thereof
CN114154520B (en) Training method of machine translation model, machine translation method, device and equipment
CN110570877B (en) Sign language video generation method, electronic device and computer readable storage medium
CN117033600A (en) Generative role engine for cognitive entity synthesis
CN113643706B (en) Speech recognition method, device, electronic equipment and storage medium
CN114187173A (en) Model training method, image processing method and device, electronic device and medium
KR102379730B1 (en) Learning method of conversation agent system and apparatus
CN115428013A (en) Information processing apparatus and program
CN112562733A (en) Media data processing method and device, storage medium and computer equipment
KR20160062588A (en) Method and Apparatus for user adaptive recognition of voice command using network

Legal Events

Date Code Title Description
PB01 Publication
WW01 Invention patent application withdrawn after publication
Application publication date: 20230428