CN114418038A - Space-based information classification method and device based on multi-mode fusion and electronic equipment

Info

Publication number
CN114418038A
CN114418038A (application CN202210317228.8A)
Authority
CN
China
Prior art keywords: information, text, picture, fusion, space
Prior art date
Legal status
Pending
Application number
CN202210317228.8A
Other languages
Chinese (zh)
Inventor
刘禹汐
姜青涛
侯立旺
王慧静
Current Assignee
Beijing Daoda Tianji Technology Co ltd
Original Assignee
Beijing Daoda Tianji Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Daoda Tianji Technology Co., Ltd.
Priority to CN202210317228.8A
Publication of CN114418038A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/253 Fusion techniques of extracted features
    • G06F 16/353 Clustering; Classification into predefined classes
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G06F 18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 Combinations of networks
    • G06N 3/047 Probabilistic or stochastic networks

Abstract

The embodiments of the disclosure provide a space-based information classification method and device based on multi-modal fusion, and an electronic device. The method comprises the following steps: respectively obtaining text information features and picture information features of the space-based information to be classified; extracting corresponding text feature vectors and picture feature vectors from the text information features and the picture information features; computing the correlation between the text feature vectors and the picture feature vectors, and jointly splicing the correlated text and picture feature vectors to obtain fusion features of the space-based information to be classified; and inputting the fusion features into a preset classification model for classification. In this way, multi-modal information can be reasonably processed to obtain rich feature information, classification can be performed according to the interactions between features, the efficiency of information classification is improved, and subsequent work such as fast information retrieval is facilitated.

Description

Space-based information classification method and device based on multi-mode fusion and electronic equipment
Technical Field
The present disclosure relates to data classification technology, and more particularly, to information classification technology.
Background
At present, the explosive growth and accessibility of aerospace open-source multi-modal data on the Internet provide wide opportunities: the intrinsic knowledge of heterogeneous information can be fused from multiple aspects, which challenges traditional text-only intelligence classification techniques. Existing schemes and research on the classification of open-source space-based intelligence are mostly text-based and classify the intelligence using natural language processing techniques.
The existing space-based open-source intelligence classification techniques are based on the single text modality and do not fuse the multi-source text and picture information contained in open-source intelligence. The reliability of single-source information is not high, and space-based intelligence is long and contains a large number of proper nouns, all of which degrade the classification effect.
Disclosure of Invention
The disclosure provides a space-based information classification method and device based on multi-mode fusion and electronic equipment.
According to a first aspect of the present disclosure, a space-based intelligence classification method based on multimodal fusion is provided. The method comprises the following steps:
respectively obtaining text information characteristics and picture information characteristics of the space-based information to be classified;
extracting corresponding text characteristic vectors and corresponding picture characteristic vectors according to the text information characteristics and the picture information characteristics;
calculating the correlation between the text characteristic vector and the picture characteristic vector, and performing joint splicing on the correlated text characteristic vector and picture characteristic vector to obtain fusion characteristics of the space-based information to be classified;
and inputting the fusion characteristics into a preset classification model for classification.
In some implementations of the first aspect, extracting the corresponding text feature vector from the text intelligence feature comprises:
the method comprises the steps of obtaining a text in space-based information to be classified, carrying out vectorization representation on the text, inputting the vectorization text into a pre-trained text information feature extraction model, and obtaining a corresponding text feature vector.
In some implementations of the first aspect, extracting the corresponding picture feature vector according to the picture intelligence feature comprises:
and obtaining a picture in the space-based information to be classified, and inputting the picture into a pre-trained picture information feature extraction model to obtain a corresponding picture feature vector.
In some implementations of the first aspect, the pre-trained text information feature extraction model is a Bi-GRU model;
the pre-trained picture information feature extraction model is a VGG-16 model comprising 13 convolutional layers, 5 pooling layers and 2 fully-connected layers.
In some implementations of the first aspect, calculating the text feature vector and the picture feature vector and jointly splicing the correlated text feature vector and picture feature vector to obtain the fusion feature of the space-based information to be classified includes:
carrying out similarity calculation on the text information characteristic and the picture information characteristic according to a Pearson correlation coefficient; if the similarity reaches a threshold value, performing joint splicing; and if the similarity does not reach the threshold value, the joint splicing is not carried out.
In some implementations of the first aspect, the classification model is an MLP model;
the MLP model comprises a hidden layer, and the hidden layer uses a dropout algorithm;
the output layer of the MLP model comprises a softmax classifier that employs a multi-class cross-entropy loss function for classification.
According to a second aspect of the present disclosure, a space-based intelligence classification apparatus based on multimodal fusion is provided. The device includes:
the acquisition module is used for respectively acquiring text information characteristics and picture information characteristics of the space-based information to be classified;
the feature extraction module is used for extracting corresponding text feature vectors and picture feature vectors according to the text information features and the picture information features;
the first fusion module is used for calculating the correlation between the text characteristic vector and the picture characteristic vector, and jointly splicing the correlated text characteristic vector and picture characteristic vector to obtain fusion characteristics of the space-based information to be classified;
and the second fusion module is used for inputting the fusion characteristics into a preset classification model for classification.
According to a third aspect of the present disclosure, an electronic device is provided. The electronic device includes: a memory having a computer program stored thereon and a processor implementing the method as described above when executing the program.
In the method, the text information features and the picture information features are fused to classify the information. Because different modalities express information in different ways, there is a certain degree of intersection and complementarity between them, and multiple different kinds of information interaction may exist across the modalities; fusing the modalities therefore yields richer and more reliable features for classification.
It should be understood that the statements herein reciting aspects are not intended to limit the critical or essential features of the embodiments of the present disclosure, nor are they intended to limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. The accompanying drawings are included to provide a further understanding of the present disclosure, and are not intended to limit the disclosure thereto, and the same or similar reference numerals will be used to indicate the same or similar elements, where:
FIG. 1 shows a flow diagram of a multi-modal fusion-based space-based intelligence classification method according to an embodiment of the present disclosure;
FIG. 2 illustrates a logic diagram of a multi-modal fusion based space-based intelligence classification method according to an embodiment of the present disclosure;
FIG. 3 is a diagram illustrating the process of extracting text feature vectors;
FIG. 4 is a diagram illustrating the process of extracting feature vectors of a picture;
FIG. 5 shows a schematic diagram of a joint splicing process;
FIG. 6 shows a block diagram of a multi-modal fusion based space-based intelligence classification apparatus according to an embodiment of the present disclosure;
FIG. 7 shows a block diagram of an electronic device for implementing a multi-modal fusion-based space-based intelligence classification method of an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
In the method, the text information characteristics and the picture information characteristics are fused to classify the information, so that the information classification efficiency is improved, and subsequent work such as quick information search is facilitated.
Fig. 1 shows a flow diagram of a multi-modal fusion-based space-based intelligence classification method 100 according to an embodiment of the present disclosure.
As shown in fig. 1, the multi-modal fusion-based space-based intelligence classification method 100 includes:
s101, respectively obtaining text information characteristics and picture information characteristics of sky-based information to be classified;
s102, extracting corresponding text characteristic vectors and corresponding picture characteristic vectors according to the text information characteristics and the picture information characteristics;
s103, calculating the text characteristic vector and the picture characteristic vector, and performing combined splicing on the text characteristic vector and the picture characteristic vector with correlation to obtain fusion characteristics of the sky-based information to be classified;
and S104, inputting the fusion characteristics into a preset classification model for classification.
FIG. 2 illustrates a logic diagram of a multi-modal fusion based space-based intelligence classification method according to an embodiment of the present disclosure.
As shown in fig. 2, the multi-modal fusion-based space-based intelligence classification method of the present disclosure takes multi-modal fusion intelligence classification and recognition as its core: features are extracted from the different modalities, and the extracted modality features are fused and then classified. The method performs one fusion at the feature layer and one at the decision layer. The first, splicing fusion makes the feature fusion more complete and lets the multi-modal features complement one another; the second fusion, through the dropout layer, filters out the crossing and redundant information among the modalities. Through these two fusions, the modal representations and cross-modal complementary correlations of the multi-modal data can be correctly captured, the multi-modal information is processed reasonably, rich feature information is obtained, and the space-based intelligence classification effect is further improved.
In step S102, extracting a corresponding text feature vector according to the text intelligence feature includes:
the method comprises the steps of obtaining a text in space-based information to be classified, carrying out vectorization representation on the text, inputting the vectorization text into a pre-trained text information feature extraction model, and obtaining a corresponding text feature vector.
Fig. 3 shows a process diagram of extracting text feature vectors.
As shown in fig. 3, vectorizing the text first requires training word vectors: a word vector model is trained on a large-scale corpus, and the trained model is then used to represent the text as vectors. The word vector model may be a CBOW model; in the embodiments of the disclosure the CBOW model in the gensim package is used to train the word vectors, with the window set to 5, min_count set to 3, and the word vector dimension set to 100.
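A minimal sketch of this word-vector training step, assuming the gensim library and a pre-tokenized corpus (the variable name corpus_tokens is illustrative, not from the patent):

```python
from gensim.models import Word2Vec

# corpus_tokens: list of tokenized intelligence texts, e.g. [["satellite", "launch", ...], ...]
cbow = Word2Vec(
    sentences=corpus_tokens,
    vector_size=100,   # 100-dimensional word vectors, as in the embodiment
    window=5,          # context window of 5
    min_count=3,       # ignore words occurring fewer than 3 times
    sg=0,              # sg=0 selects the CBOW training algorithm
)

# Vectorize one text as a sequence of word vectors (out-of-vocabulary words are skipped).
doc_vectors = [cbow.wv[w] for w in corpus_tokens[0] if w in cbow.wv]
```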
In some embodiments, the pre-trained text intelligence feature extraction model is a Bi-GRU model. The vectorized text is input into the Bi-GRU for feature extraction, and the text feature vector is obtained by training with conventional natural language processing techniques. Specifically, a word embedding layer is placed in front of the Bi-GRU layer. The GRU performs similarly to the LSTM but has fewer parameters and therefore converges more easily; each GRU unit controls the flow of information by means of a reset gate and an update gate. A Bi-GRU captures contextual information better than a unidirectional GRU, so the Bi-GRU is chosen to extract the text intelligence features in the embodiments of the disclosure. The number of Bi-GRU hidden units is set to 64, dropout with a rate of 0.5 is adopted to prevent overfitting, and the model is trained through a fully-connected layer and a softmax layer to obtain the text feature vector.
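The Bi-GRU text branch described above could be sketched as follows in Keras; the maximum sequence length, the width of the fully-connected layer and the number of classes are assumptions for illustration, not values given in the patent:

```python
from tensorflow.keras import layers, models

def build_text_branch(vocab_size: int, embed_dim: int = 100,
                      max_len: int = 200, num_classes: int = 3) -> models.Model:
    inp = layers.Input(shape=(max_len,))
    x = layers.Embedding(vocab_size, embed_dim)(inp)      # word embedding layer in front of the Bi-GRU
    x = layers.Bidirectional(layers.GRU(64))(x)            # 64 hidden units per direction
    x = layers.Dropout(0.5)(x)                             # dropout rate 0.5 against overfitting
    feat = layers.Dense(128, activation="relu",
                        name="text_feature")(x)            # fully-connected layer; output used as the text feature vector
    out = layers.Dense(num_classes, activation="softmax")(feat)
    return models.Model(inp, out)
```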
In step S102, extracting a corresponding picture feature vector according to the picture intelligence feature includes:
and obtaining a picture in the space-based information to be classified, and inputting the picture into a pre-trained picture information feature extraction model to obtain a corresponding picture feature vector.
Fig. 4 shows a process diagram of extracting a picture feature vector.
In some embodiments, the pre-trained picture intelligence feature extraction model is a VGG-16 model comprising 13 convolutional layers, 5 pooling layers, and 2 fully-connected layers.
A convolutional neural network model generally contains a large number of parameters to be learned and needs a large training set to learn them. Because computing resources are limited, transfer learning is used to fine-tune a pre-trained model; the embodiments of the disclosure therefore select the VGG-16 model pre-trained on ImageNet in Keras as the reference model.
As shown in fig. 4, according to the type of picture intelligence, the embodiments of the disclosure modify the final fully-connected output of the original model and replace the preceding fully-connected layer, setting the number of output neurons to 3; the model contains 13 convolutional layers, 5 pooling layers, and 2 fully-connected layers. The embodiments of the disclosure thus obtain the picture feature vector by training in a fine-tuning manner.
According to the embodiments of the disclosure, a fine-tuned CNN based on transfer learning is constructed to extract the picture intelligence features. The picture intelligence contains abundant information; representing the pictures as vectors extracts this rich content and makes it convenient to fuse with the text intelligence features.
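A sketch of the fine-tuning setup, assuming TensorFlow/Keras with the ImageNet-pretrained VGG-16; the 4096-unit widths of the two fully-connected layers and the 224x224 input size are standard VGG-16 choices assumed here rather than values stated in the patent:

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

def build_image_branch(num_classes: int = 3,
                       input_shape=(224, 224, 3)) -> models.Model:
    base = VGG16(weights="imagenet", include_top=False, input_shape=input_shape)
    for layer in base.layers:
        layer.trainable = False                       # freeze the 13 convolutional / 5 pooling layers
    x = layers.Flatten()(base.output)
    x = layers.Dense(4096, activation="relu")(x)      # first fully-connected layer
    feat = layers.Dense(4096, activation="relu",
                        name="image_feature")(x)      # second fully-connected layer; output used as the picture feature vector
    out = layers.Dense(num_classes, activation="softmax")(feat)  # replaced output layer with 3 neurons
    return models.Model(base.input, out)
```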
In step S103, calculating the text feature vector and the picture feature vector and jointly splicing the correlated text feature vector and picture feature vector to obtain the fusion feature of the space-based information to be classified includes:
carrying out similarity calculation on the text information characteristic and the picture information characteristic according to a Pearson correlation coefficient; if the similarity reaches a threshold value, performing joint splicing; and if the similarity does not reach the threshold value, the joint splicing is not carried out.
Wherein fig. 5 shows a schematic diagram of a joint splicing process.
After the text features and picture features extracted by the Bi-GRU model and the fine-tuned VGG-16 model are respectively obtained, the correlation between the two modalities needs to be judged, and the correlated text feature vector and picture feature vector are jointly spliced, so that the single modalities cooperate with one another under the given constraint conditions.
Because each modality contains different information, multi-modal cooperation needs to preserve the unique characteristics of each modality. The cooperation method used here is based on cross-modal similarity, which aims to learn a common subspace by directly measuring the distances between vectors of different modalities; cross-modal correlation based approaches, by contrast, aim to learn a shared subspace that maximizes the correlation between the representation sets of the different modalities. Under the constraint of the similarity measure, the cross-modal similarity method preserves the similarity structure between modalities, so that the cross-modal distance between the same semantics or related objects is as small as possible, and the distance between different semantics is as large as possible.
The Pearson correlation coefficient is adopted to compute the similarity between the feature vectors of the two modalities. Let Q and D denote the fixed-length feature vectors obtained from the text modality in the first step and the picture modality in the second step, respectively; the similarity is computed as

$$ r = \frac{\sum_{i}(Q_i - \bar{Q})(D_i - \bar{D})}{\sqrt{\sum_{i}(Q_i - \bar{Q})^2}\,\sqrt{\sum_{i}(D_i - \bar{D})^2}} $$

where $Q_i$ and $D_i$ denote the i-th components of the two vectors, and $\bar{Q}$ and $\bar{D}$ are the means of Q and D. The value of r ranges from -1 to +1. The larger the absolute value of the correlation coefficient, the stronger the correlation: the closer it is to 1 or -1, the stronger the correlation, and the closer it is to 0, the weaker the correlation.
The threshold is applied to r: if the absolute value of r is less than 0.7, the picture and the text are considered uncorrelated and are not fused, and the classification result is determined directly by the text feature vector; if the absolute value of r is greater than or equal to 0.7, the two feature vectors are spliced into one long vector for feature-layer fusion.
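A minimal NumPy sketch of this correlation-gated splicing, assuming the two branches already produce fixed-length feature vectors of equal dimension:

```python
import numpy as np

def fuse_features(q: np.ndarray, d: np.ndarray, threshold: float = 0.7) -> np.ndarray:
    """Pearson-gated joint splicing of a text feature vector q and a picture feature vector d."""
    r = np.corrcoef(q, d)[0, 1]          # Pearson correlation coefficient between the two vectors
    if abs(r) >= threshold:
        return np.concatenate([q, d])    # correlated: splice into one long fused vector
    return q                             # uncorrelated: classification falls back to the text feature alone
```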
As shown in fig. 5, the embodiments of the disclosure perform the first fusion in this way, splicing the abstract features together. The spliced feature data form effective cross-modal features, so that the modal representations and cross-modal complementary correlations of the multi-modal data are accurately captured.
In step S104, the classification model is an MLP model;
the MLP model comprises a hidden layer, and the hidden layer uses a dropout algorithm;
the output layer of the MLP model comprises a softmax classifier that employs a multi-class cross-entropy loss function for classification.
Because the fusion feature vector extracted from space-based open-source intelligence is complex, and to better account for the influence of complex features on the classification effect, an MLP (multi-layer perceptron) neural network model with dropout is selected to classify the feature vector obtained from the first fusion. The dropout layer is added because the picture features and text feature vectors were fused by splicing in step S103; it prevents the model from overfitting and can be regarded as a second fusion of the picture and text features. The MLP adopted in the embodiments of the disclosure contains only one hidden layer, i.e., a three-layer neural network structure, and dropout is applied to the hidden layer, meaning that some outputs of the hidden layer are randomly set to 0 during training. The dropout rate is set to 0.2 here; it is usually set in the range 0.2-0.5, a larger value discards more features, and it is a hyper-parameter. The output-layer softmax classifier adopts a multi-class cross-entropy loss function. All parameters of the MLP are the connection weights and biases between layers, namely W1, b1, W2 and b2.
The parameters may be determined by stochastic gradient descent (SGD): all parameters are first initialized randomly and then trained iteratively, with gradients computed and parameters updated continually until the error is sufficiently small. In this way the model can fully fit the data features, making its classification of the fused feature vectors more reliable.
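The classifier described above could be sketched as a three-layer Keras network; the hidden-layer width and the SGD learning rate are assumptions for illustration:

```python
from tensorflow.keras import layers, models, optimizers

def build_classifier(fused_dim: int, num_classes: int) -> models.Model:
    inp = layers.Input(shape=(fused_dim,))
    h = layers.Dense(256, activation="relu")(inp)   # single hidden layer (width assumed)
    h = layers.Dropout(0.2)(h)                      # second fusion: randomly zero 20% of hidden outputs during training
    out = layers.Dense(num_classes, activation="softmax")(h)
    model = models.Model(inp, out)
    model.compile(optimizer=optimizers.SGD(learning_rate=0.01),
                  loss="categorical_crossentropy",  # multi-class cross-entropy loss
                  metrics=["accuracy"])
    return model
```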
In some embodiments, the fusion features are divided into training set samples and test set samples; the training set samples are input into the MLP model for training, the test set samples are then input into the trained MLP model, and whether correct classification results are output is observed to determine whether training is complete.
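Assuming fused feature vectors X_fused and one-hot labels y_onehot have been assembled (both names are illustrative), the split-and-train step might use the build_classifier sketch above as follows:

```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X_fused, y_onehot, test_size=0.2)
clf = build_classifier(fused_dim=X_train.shape[1], num_classes=y_onehot.shape[1])
clf.fit(X_train, y_train, epochs=20, batch_size=32)   # epoch count and batch size assumed
print(clf.evaluate(X_test, y_test))                   # check classification results on the held-out samples
```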
It is noted that while for simplicity of explanation, the foregoing method embodiments have been described as a series of acts or combination of acts, it will be appreciated by those skilled in the art that the present disclosure is not limited by the order of acts, as some steps may, in accordance with the present disclosure, occur in other orders and concurrently. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that acts and modules referred to are not necessarily required by the disclosure.
The above is a description of embodiments of the method, and the embodiments of the apparatus are further described below.
Fig. 6 shows a block diagram of a multi-modal fusion based space-based intelligence classification apparatus 600 according to an embodiment of the present disclosure.
As shown in fig. 6, the space-based intelligence classification apparatus 600 based on multi-modal fusion includes:
the acquisition module 601 is used for respectively acquiring text information characteristics and picture information characteristics of the space-based information to be classified;
the feature extraction module 602 is configured to extract corresponding text feature vectors and picture feature vectors according to the text intelligence features and the picture intelligence features;
the first fusion module 603 is configured to calculate the correlation between the text feature vector and the picture feature vector, and jointly splice the correlated text feature vector and picture feature vector to obtain the fusion feature of the space-based information to be classified;
and a second fusing module 604, configured to input the fused features into a preset classification model for classification.
In some embodiments, the system further comprises a text feature vector extraction module for obtaining a text in the space-based intelligence to be classified, performing vectorization representation on the text, and inputting the vectorized text into a pre-trained text intelligence feature extraction model to obtain a corresponding text feature vector.
In some embodiments, the image feature vector extraction module is further included, and is configured to obtain an image in the space-based information to be classified, and input the image into a pre-trained image information feature extraction model to obtain a corresponding image feature vector.
In some embodiments, the system further comprises a joint splicing module for performing similarity calculation on the text intelligence features and the picture intelligence features according to a pearson correlation coefficient; if the similarity reaches a threshold value, performing joint splicing; and if the similarity does not reach the threshold value, the joint splicing is not carried out.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the described module may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
In the technical scheme of the disclosure, the acquisition, storage, application and the like of the personal information of the related user all accord with the regulations of related laws and regulations, and do not violate the good customs of the public order.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an electronic device 700 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
The device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
A number of components in the electronic device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of various general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the various methods and processes described above, such as the method 100. For example, in some embodiments, the method 100 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the method 100 described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the method 100 by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special purpose or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server with a combined blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (8)

1. A space-based intelligence classification method based on multi-modal fusion is characterized by comprising the following steps:
respectively obtaining text information characteristics and picture information characteristics of the space-based information to be classified;
extracting corresponding text characteristic vectors and corresponding picture characteristic vectors according to the text information characteristics and the picture information characteristics;
calculating the correlation between the text characteristic vector and the picture characteristic vector, and performing joint splicing on the correlated text characteristic vector and picture characteristic vector to obtain fusion characteristics of the space-based information to be classified;
and inputting the fusion characteristics into a preset classification model for classification.
2. The multi-modal fusion-based space-based intelligence classification method of claim 1, wherein extracting corresponding text feature vectors from text intelligence features comprises:
the method comprises the steps of obtaining a text in space-based information to be classified, carrying out vectorization representation on the text, inputting the vectorization text into a pre-trained text information feature extraction model, and obtaining a corresponding text feature vector.
3. The multi-modal fusion-based space-based intelligence classification method of claim 2, wherein extracting corresponding picture feature vectors according to picture intelligence features comprises:
and obtaining a picture in the space-based information to be classified, and inputting the picture into a pre-trained picture information feature extraction model to obtain a corresponding picture feature vector.
4. The method for space-based intelligence taxonomy based on multimodal fusion of claim 3,
the pre-trained text information feature extraction model is a Bi-GRU model;
the pre-trained picture information feature extraction model is a VGG-16 model comprising 13 convolutional layers, 5 pooling layers and 2 fully-connected layers.
5. The multi-modal fusion-based space-based intelligence classification method of claim 1, wherein calculating the text feature vector and the picture feature vector and jointly splicing the correlated text feature vector and picture feature vector to obtain the fusion features of the space-based intelligence to be classified comprises:
carrying out similarity calculation on the text information characteristic and the picture information characteristic according to a Pearson correlation coefficient; if the similarity reaches a threshold value, performing joint splicing; and if the similarity does not reach the threshold value, the joint splicing is not carried out.
6. The multi-modal fusion based space-based intelligence classification method of claim 1, wherein the classification model is an MLP model;
the MLP model comprises a hidden layer, and the hidden layer uses a dropout algorithm;
the output layer of the MLP model comprises a softmax classifier that employs a multi-class cross-entropy loss function for classification.
7. A space-based information classification device based on multi-modal fusion, characterized by comprising:
the acquisition module is used for respectively acquiring text information characteristics and picture information characteristics of the space-based information to be classified;
the feature extraction module is used for extracting corresponding text feature vectors and picture feature vectors according to the text information features and the picture information features;
the first fusion module is used for calculating the correlation between the text characteristic vector and the picture characteristic vector, and jointly splicing the correlated text characteristic vector and picture characteristic vector to obtain fusion characteristics of the space-based information to be classified;
and the second fusion module is used for inputting the fusion characteristics into a preset classification model for classification.
8. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
CN202210317228.8A 2022-03-29 2022-03-29 Space-based information classification method and device based on multi-mode fusion and electronic equipment Pending CN114418038A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210317228.8A CN114418038A (en) 2022-03-29 2022-03-29 Space-based information classification method and device based on multi-mode fusion and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210317228.8A CN114418038A (en) 2022-03-29 2022-03-29 Space-based information classification method and device based on multi-mode fusion and electronic equipment

Publications (1)

Publication Number Publication Date
CN114418038A true CN114418038A (en) 2022-04-29

Family

ID=81263770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210317228.8A Pending CN114418038A (en) 2022-03-29 2022-03-29 Space-based information classification method and device based on multi-mode fusion and electronic equipment

Country Status (1)

Country Link
CN (1) CN114418038A (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145974A (en) * 2018-08-13 2019-01-04 广东工业大学 One kind being based on the matched multi-level image Feature fusion of picture and text
US20210011941A1 (en) * 2019-07-14 2021-01-14 Alibaba Group Holding Limited Multimedia file categorizing, information processing, and model training method, system, and device
CN111275085A (en) * 2020-01-15 2020-06-12 重庆邮电大学 Online short video multi-modal emotion recognition method based on attention fusion
CN113360599A (en) * 2021-05-18 2021-09-07 苏州海赛人工智能有限公司 Multi-source heterogeneous information convergence cooperative processing platform based on content identification
CN113535949A (en) * 2021-06-15 2021-10-22 杭州电子科技大学 Multi-mode combined event detection method based on pictures and sentences
CN113822283A (en) * 2021-06-30 2021-12-21 腾讯科技(深圳)有限公司 Text content processing method and device, computer equipment and storage medium
CN114078474A (en) * 2021-11-09 2022-02-22 京东科技信息技术有限公司 Voice conversation processing method and device based on multi-modal characteristics and electronic equipment
CN114155270A (en) * 2021-11-10 2022-03-08 南方科技大学 Pedestrian trajectory prediction method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhang Guobiao et al.: "Social media fake news detection based on multi-modal feature fusion" (基于多模态特征融合的社交媒体虚假新闻检测), Information Science (《情报科学》), vol. 39, no. 10, 31 October 2021 (2021-10-31), pages 3-4 *
Li An: "Corpus Linguistics and Its Implementation with Python" (《语料库语言学及Python实现》), 31 December 2018, page 13 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114970743A (en) * 2022-06-17 2022-08-30 中国科学院地理科学与资源研究所 Multi-source remote sensing rainfall data fusion method based on multi-modal deep learning
CN114970743B (en) * 2022-06-17 2022-11-08 中国科学院地理科学与资源研究所 Multi-source remote sensing rainfall data fusion method based on multi-modal deep learning
CN116846688A (en) * 2023-08-30 2023-10-03 南京理工大学 Interpretable flow intrusion detection method based on CNN
CN116846688B (en) * 2023-08-30 2023-11-21 南京理工大学 Interpretable flow intrusion detection method based on CNN

Similar Documents

Publication Publication Date Title
CN112966522B (en) Image classification method and device, electronic equipment and storage medium
CN113326764B (en) Method and device for training image recognition model and image recognition
CN111310672A (en) Video emotion recognition method, device and medium based on time sequence multi-model fusion modeling
CN109284406B (en) Intention identification method based on difference cyclic neural network
CN114418038A (en) Space-based information classification method and device based on multi-mode fusion and electronic equipment
WO2022156561A1 (en) Method and device for natural language processing
CN116152833B (en) Training method of form restoration model based on image and form restoration method
CN111985525B (en) Text recognition method based on multi-mode information fusion processing
CN113837308B (en) Knowledge distillation-based model training method and device and electronic equipment
CN114494784A (en) Deep learning model training method, image processing method and object recognition method
JP2022078310A (en) Image classification model generation method, device, electronic apparatus, storage medium, computer program, roadside device and cloud control platform
CN112862005A (en) Video classification method and device, electronic equipment and storage medium
CN114429633A (en) Text recognition method, model training method, device, electronic equipment and medium
CN114782722B (en) Image-text similarity determination method and device and electronic equipment
CN114037059A (en) Pre-training model, model generation method, data processing method and data processing device
CN113705715B (en) Time sequence classification method based on LSTM and multi-scale FCN
CN114707638A (en) Model training method, model training device, object recognition method, object recognition device, object recognition medium and product
CN115170919A (en) Image processing model training method, image processing device, image processing equipment and storage medium
CN114611521A (en) Entity identification method, device, equipment and storage medium
CN114821063A (en) Semantic segmentation model generation method and device and image processing method
CN113806541A (en) Emotion classification method and emotion classification model training method and device
CN113886543A (en) Method, apparatus, medium, and program product for generating an intent recognition model
CN113094504A (en) Self-adaptive text classification method and device based on automatic machine learning
CN113378781B (en) Training method and device of video feature extraction model and electronic equipment
CN115879446B (en) Text processing method, deep learning model training method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
    Address after: 100085 room 703, 7 / F, block C, 8 malianwa North Road, Haidian District, Beijing
    Applicant after: Beijing daoda Tianji Technology Co.,Ltd.
    Address before: 100085 room 703, 7 / F, block C, 8 malianwa North Road, Haidian District, Beijing
    Applicant before: Beijing daoda Tianji Technology Co.,Ltd.
RJ01 Rejection of invention patent application after publication
    Application publication date: 20220429