CN113343676B - Sentence classification method and system based on convolutional neural network - Google Patents

Sentence classification method and system based on convolutional neural network

Info

Publication number
CN113343676B
Authority
CN
China
Prior art keywords
neural network
convolutional neural
sentence
vector
vectors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202110394681.4A
Other languages
Chinese (zh)
Other versions
CN113343676A (en)
Inventor
解福
贾艺鸣
王森
孟虎
徐传杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-04-13
Filing date
2021-04-13
Publication date
2022-12-06
Application filed by Shandong Normal University
Priority to CN202110394681.4A
Publication of CN113343676A
Application granted
Publication of CN113343676B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a sentence classification method and system based on a convolutional neural network. The method comprises: encoding the letters and characters in a sentence and obtaining letter vectors and character vectors, respectively, from the context relationships of the letters and characters; performing dot multiplication on the character vectors and the letter vectors in sequence according to the sentence structure to obtain a mixed vector; and extracting features of the mixed vector with a convolutional neural network and, based on the extracted features, obtaining a sentence classification result according to preset tag types. The letter vectors and character vectors are extracted using word2vec, the mixed vector is processed by a convolutional neural network to extract features, and the extracted features are processed by a softmax layer to obtain the corresponding sentence labels. The method can meet the requirements of machine translation, speech recognition, character recognition and other natural language processing tasks, and addresses the poor precision, low speed and poor real-time performance of existing sentence classification methods.

Description

Sentence classification method and system based on convolutional neural network
Technical Field
The invention relates to the technical field of natural language processing, in particular to a sentence classification method and system based on a convolutional neural network.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In the field of natural language processing, sentence classification is very important: it helps a machine better understand the meaning of sentences and thereby improves the machine's understanding of language. Natural language processing is a discipline integrating linguistics, computer science and mathematics. Research in this field concerns natural language, that is, the language people use every day, so it is closely related to the study of linguistics. However, natural language processing is not the general study of natural language; it aims to develop computer systems, particularly software systems, that can communicate effectively in natural language, and it is therefore a part of computer science.
In recent years, a research foundation has accumulated in this field, generally drawing on methods from both big data and artificial intelligence. For big data, large volumes of data contain meaningful information, and mining that information for people to use is of great significance. For artificial intelligence, the accuracy of machine translation, speech recognition and similar tasks is increasingly important, and how to implement machine translation, speech recognition and the like quickly has attracted more and more researchers.
However, whether for big data mining or for artificial intelligence language processing, sentence classification is increasingly important, because classifying sentences well allows the subsequent work to be carried out better.
Disclosure of Invention
To solve the above problems, the invention provides a sentence classification method and system based on a convolutional neural network, in which word2vec is used to extract letter vectors and character vectors, a convolutional neural network processes the mixed vector and extracts features, and the extracted features are processed by a softmax layer to obtain the corresponding sentence labels. The method can meet the requirements of natural language processing tasks such as machine translation, speech recognition and character recognition, and addresses the poor precision, low speed and poor real-time performance of existing sentence classification methods.
In order to achieve the purpose, the invention adopts the following technical scheme:
in a first aspect, the present invention provides a sentence classification method based on a convolutional neural network, including:
after letters and characters in a sentence are coded, respectively obtaining letter vectors and character vectors according to the context relationship of the letters and the characters;
according to the sentence structure, cross multiplication is carried out on the letter vectors and the character vectors in sequence to obtain a mixed vector;
and extracting features of the mixed vector based on a convolutional neural network, and obtaining a sentence classification result according to a preset tag type based on the extracted features.
In a second aspect, the present invention provides a sentence classification system based on a convolutional neural network, comprising:
the vector extraction module is configured to encode letters and characters in a sentence and then respectively obtain letter vectors and character vectors according to the context relationship of the letters and the characters;
the mixed vector calculation module is configured to perform cross multiplication on the letter vectors and the character vectors in sequence according to the sentence structure to obtain mixed vectors;
and the classification module is configured to extract the features of the mixed vector based on the convolutional neural network and obtain a sentence classification result according to the preset tag type based on the extracted features.
In a third aspect, the present invention provides an electronic device comprising a memory and a processor, and computer instructions stored in the memory and executed on the processor, wherein when the computer instructions are executed by the processor, the method of the first aspect is performed.
In a fourth aspect, the present invention provides a computer readable storage medium for storing computer instructions which, when executed by a processor, perform the method of the first aspect.
Compared with the prior art, the invention has the following beneficial effects:
First, the letter vectors and character vectors corresponding to a sentence are obtained using word2vec; cross multiplication is performed on the letter vectors and the character vectors to obtain a mixed vector; and features of the mixed vector are extracted by a convolutional neural network, so that feature extraction covers a wide span and is more detailed. The extracted features are processed by a softmax layer to obtain the corresponding sentence labels. The method has good generalization and robustness, can meet the requirements of machine translation, speech recognition, character recognition and other natural language processing tasks, is fast and precise, and addresses the problems that existing sentence classification methods have poor precision and low speed and cannot be applied in real time.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention, and are included to illustrate an exemplary embodiment of the invention and not to limit the invention.
Fig. 1 is a flowchart of a sentence classification method based on a convolutional neural network according to embodiment 1 of the present invention;
Fig. 2 is a structural diagram of the softmax classification provided in embodiment 1 of the present invention.
Detailed Description
The invention is further explained by the following embodiments in conjunction with the drawings.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular is intended to include the plural unless the context clearly dictates otherwise, and furthermore, it should be understood that the terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
Example 1
As shown in fig. 1, the present embodiment provides a sentence classification method based on a convolutional neural network, including:
S1: after letters and characters in a sentence are encoded, respectively obtaining letter vectors and character vectors according to the context relationships of the letters and the characters;
S2: according to the sentence structure, performing cross multiplication on the letter vectors and the character vectors in sequence to obtain a mixed vector;
S3: extracting features of the mixed vector based on the convolutional neural network, and obtaining a sentence classification result according to the preset tag type based on the extracted features.
In step S1, this embodiment obtains the letter vectors and character vectors using word2vec vectors. The word2vec vectors are trained on 100 billion words from Google News, have a dimension of 300, and are trained with the continuous bag-of-words architecture; words that are not in the pre-trained word set are initialized randomly.
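The lookup with random initialization described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_embedding_table`, the toy `pretrained` dictionary and the uniform range of ±0.25 are assumptions, since the text only states that unseen words are initialized randomly.

```python
import random

EMBEDDING_DIM = 300  # vector dimension stated in the text


def build_embedding_table(vocabulary, pretrained, dim=EMBEDDING_DIM, seed=0):
    """Return {token: vector}; tokens missing from `pretrained` get random vectors."""
    rng = random.Random(seed)
    table = {}
    for token in vocabulary:
        if token in pretrained:
            table[token] = list(pretrained[token])
        else:
            # Assumption: uniform initialization in [-0.25, 0.25];
            # the source does not specify the distribution.
            table[token] = [rng.uniform(-0.25, 0.25) for _ in range(dim)]
    return table


# Hypothetical toy data standing in for real pre-trained word2vec vectors.
pretrained = {"a": [0.1] * EMBEDDING_DIM}
table = build_embedding_table(["a", "b"], pretrained)
```

In practice the `pretrained` mapping would be loaded from a trained word2vec model rather than written by hand.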
Specifically, the collected sentence data includes two kinds of natural language text, characters and letters. The method for extracting the letter vectors includes:
S1-1: processing the letters using a vocabulary constructed from a corpus, encoding the letters by one-hot encoding and representing them as vectors, where each vector has 10000 dimensions and each dimension value is either 0 or 1;
S1-2: constructing a letter language model and establishing context relationships; a skip-gram model is used to construct the letter language model. When a word appears in a sentence, it is related to its context, and the most probable context of the word is inferred from the word and the corresponding corpus, which improves feature significance and increases accuracy.
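The one-hot encoding step can be illustrated with a short sketch. The helper names `build_vocab` and `one_hot` are hypothetical, and the toy vocabulary has only four entries rather than the 10000 dimensions mentioned in the text.

```python
def build_vocab(corpus_tokens):
    """Map each distinct token to an index, in order of first appearance."""
    vocab = {}
    for token in corpus_tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def one_hot(token, vocab):
    """Return a vector of len(vocab) with a single 1 at the token's index."""
    vec = [0] * len(vocab)
    vec[vocab[token]] = 1
    return vec


vocab = build_vocab(list("hello"))  # {'h': 0, 'e': 1, 'l': 2, 'o': 3}
encoded = [one_hot(ch, vocab) for ch in "hello"]
```

Every encoded vector contains exactly one 1 and is otherwise 0, which is the property the step relies on.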
Second, the method for extracting the character vectors includes:
S1-3: processing the characters using a vocabulary constructed from a corpus, encoding the characters by one-hot encoding and representing them as vectors, where each vector has 10000 dimensions and each dimension value is either 0 or 1;
S1-4: constructing a character language model and establishing context relationships; a skip-gram model is used to construct the character language model. When a Chinese character appears in a sentence, it is related to its context, and the most probable context of the Chinese character is inferred from the character and the corresponding corpus, which improves feature significance and increases accuracy.
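To make the context relationship concrete, the sketch below generates the (center, context) training pairs that a skip-gram style model learns from. The function name `skipgram_pairs` and the window size are assumptions for illustration; the tokens may equally be letters or Chinese characters.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs(["a", "b", "c", "d"], window=1)
# [('a', 'b'), ('b', 'a'), ('b', 'c'), ('c', 'b'), ('c', 'd'), ('d', 'c')]
```

A model trained to predict the context token from the center token ends up placing tokens with similar contexts close together in the vector space.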
In step S2, for different sentences and different sentence structures, the letter vector P and the character vector Q are processed according to their order and structural positions, and cross multiplication is performed on the letter vector P and the character vector Q to obtain the mixed vector L:
$$L = P \times Q$$
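The text does not define the multiplication precisely (the abstract speaks of dot multiplication, the description of cross multiplication). The sketch below therefore assumes one possible reading: the letter vectors and character vectors are aligned position by position and combined with an element-wise product. The function name `mix_vectors` and the alignment are assumptions, not the operation confirmed by the source.

```python
def mix_vectors(letter_vectors, character_vectors):
    """Combine aligned vectors position by position with an element-wise product."""
    if len(letter_vectors) != len(character_vectors):
        raise ValueError("sequences must have the same number of positions")
    mixed = []
    for p, q in zip(letter_vectors, character_vectors):
        if len(p) != len(q):
            raise ValueError("vectors at the same position must share a dimension")
        mixed.append([a * b for a, b in zip(p, q)])
    return mixed


P = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]  # toy letter vectors
Q = [[2.0, 0.0, 1.0], [4.0, 2.0, 0.0]]  # toy character vectors
L = mix_vectors(P, Q)  # [[2.0, 0.0, 3.0], [2.0, 1.0, 0.0]]
```

Other readings, such as an outer product per position, would change the shape of the mixed vector and of the convolution input that follows.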
In step S3, a natural language processing model is constructed based on the convolutional neural network to capture the most important feature in the mixed vector, namely the highest value of each feature map.
This embodiment improves the convolutional neural network and constructs a variant convolutional neural network structure. Let $x_i \in \mathbb{R}^k$ denote the $k$-dimensional character vector or $k$-dimensional letter vector at the $i$-th position in a sentence. A sentence of length $n$ can then be expressed using the following formula:
$$x_{1:n} = x_1 \oplus x_2 \oplus \cdots \oplus x_n$$
where $\oplus$ denotes the concatenation operator, $x_{1:n}$ denotes the sentence, and $x_i$ denotes the $k$-dimensional character vector or $k$-dimensional letter vector at the $i$-th position in the sentence.
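As a small illustration of the sentence representation, the sketch below concatenates n vectors of dimension k into a single sequence of length n * k. The helper name `concat_sentence` is hypothetical.

```python
def concat_sentence(vectors):
    """Concatenate n k-dimensional vectors into one flat list of length n * k."""
    flat = []
    for v in vectors:
        flat.extend(v)
    return flat


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # n = 3 positions, k = 2 dimensions
sentence = concat_sentence(x)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

Keeping the vectors as rows of an n-by-k matrix is equivalent and is usually more convenient for sliding a convolution window over positions.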
This embodiment extracts multiple features of the mixed vector using multiple filters with different window sizes; these features form the penultimate layer of the convolutional neural network and are passed to the fully connected softmax layer.
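Feature extraction with filters of different window sizes can be sketched as below. Each filter slides over the sentence to produce a feature map, and the largest value of each map is kept. The ReLU non-linearity, the function names and the toy weights are assumptions; the source does not name an activation function, and the sketch assumes the sentence is at least as long as the largest window.

```python
def relu(v):
    return v if v > 0.0 else 0.0


def conv_feature_map(sentence, filt, bias=0.0):
    """Slide one filter of shape (h, k) over a sentence of shape (n, k)."""
    h = len(filt)
    feature_map = []
    for i in range(len(sentence) - h + 1):
        total = bias
        for row, weights in zip(sentence[i:i + h], filt):
            total += sum(x * w for x, w in zip(row, weights))
        feature_map.append(relu(total))
    return feature_map


def extract_features(sentence, filters):
    """Keep the largest value of each feature map (max-over-time pooling)."""
    return [max(conv_feature_map(sentence, f)) for f in filters]


sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # n = 4, k = 2
filters = [
    [[1.0, 1.0], [1.0, 1.0]],              # window size h = 2
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],  # window size h = 3
]
features = extract_features(sentence, filters)  # [3.0, 3.0]
```

Because pooling keeps one value per filter, the feature vector has a fixed length regardless of sentence length, which is what allows it to feed a fully connected layer.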
In this embodiment, a Dropout layer is added to the penultimate layer of the convolutional neural network for regularization. Randomly dropping neuron nodes prevents overfitting of the convolutional neural network, optimizes the output result, and improves the accuracy of feature extraction.
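The random dropping of neuron nodes can be sketched as follows. This uses the common inverted-dropout formulation, in which surviving values are rescaled during training so that no scaling is needed at inference; the drop rate of 0.5 and the function name are assumptions, since the source gives no rate.

```python
import random


def dropout(features, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(features)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [f / keep if rng.random() < keep else 0.0 for f in features]


inputs = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(inputs, rate=0.5, rng=random.Random(0))  # training mode
unchanged = dropout(inputs, training=False)                # inference mode
```

At inference time the layer passes the features through unchanged, so the classifier sees the full penultimate-layer output.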
In step S3, as shown in fig. 2, the present embodiment obtains a sentence classification result according to a preset tag type by processing the extracted features through the softmax layer, so as to meet the requirements of natural language processing such as machine translation, speech recognition, character recognition, and the like.
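The softmax classification step can be sketched as a fully connected layer followed by a softmax over the preset tag types. The label names, weights and function names below are hypothetical, untrained values chosen only to make the example run.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases, labels):
    """Fully connected layer followed by softmax; returns (label, probabilities)."""
    scores = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


labels = ["question", "statement"]  # hypothetical preset tag types
weights = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical, untrained weights
label, probs = classify([3.0, 1.0], weights, [0.0, 0.0], labels)
```

The probabilities sum to 1, and the tag with the highest probability is taken as the sentence classification result.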
Example 2
The embodiment provides a sentence classification system based on a convolutional neural network, which comprises:
the vector extraction module is configured to encode letters and characters in a sentence and then respectively obtain letter vectors and character vectors according to the context relationship of the letters and the characters;
the mixed vector calculation module is configured to perform cross multiplication on the letter vectors and the character vectors in sequence according to the sentence structure to obtain mixed vectors;
and the classification module is configured to extract features of the mixed vector based on a convolutional neural network, and obtain a sentence classification result according to a preset tag type based on the extracted features.
It should be noted that the modules correspond to the steps described in embodiment 1, and the modules are the same as the corresponding steps in the implementation examples and application scenarios, but are not limited to the disclosure in embodiment 1. It should be noted that the modules described above as part of a system may be implemented in a computer system such as a set of computer-executable instructions.
In further embodiments, there is also provided:
an electronic device comprising a memory and a processor and computer instructions stored on the memory and executed on the processor, the computer instructions when executed by the processor performing the method of embodiment 1. For brevity, no further description is provided herein.
It should be understood that in this embodiment, the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
A computer readable storage medium storing computer instructions which, when executed by a processor, perform the method described in embodiment 1.
The method in embodiment 1 may be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in a memory; the processor reads the information in the memory and completes the steps of the method in combination with its hardware. To avoid repetition, this is not described in detail here.
Those of ordinary skill in the art will appreciate that the various illustrative elements, i.e., algorithm steps, described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the present invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive efforts by those skilled in the art based on the technical solution of the present invention.

Claims (10)

1. A sentence classification method based on a convolutional neural network is characterized by comprising the following steps:
after letters and characters in a sentence are coded, respectively obtaining letter vectors and character vectors according to the context relationship of the letters and the characters;
according to the sentence structure, cross multiplication is carried out on the letter vectors and the character vectors in sequence to obtain a mixed vector;
and extracting the characteristics of the mixed vector based on the convolutional neural network, and obtaining a sentence classification result according to the preset label type based on the extracted characteristics.
2. The method of claim 1, wherein the letters and characters in the sentence are encoded by one-hot.
3. The sentence classification method based on the convolutional neural network as claimed in claim 1, wherein the process of constructing the context relationship between letters and characters specifically comprises: constructing a language model using skip-gram, and inferring the context relationship from the letters, the characters and the corpus corresponding to the letters and the characters.
4. The sentence classification method based on the convolutional neural network as claimed in claim 1, wherein the process of feature extraction of the mixed vector based on the convolutional neural network comprises: features of the mixed vector are extracted by a plurality of filters having different window sizes.
5. The method for classifying sentences based on convolutional neural network as claimed in claim 1, wherein the process of feature extraction of the mixed vector based on convolutional neural network further comprises: the features of the mixed vector are formed in the penultimate layer of the convolutional neural network and passed to the fully connected softmax layer.
6. The method for classifying sentences based on convolutional neural network as claimed in claim 1, wherein the structure of the convolutional neural network comprises: a Dropout layer is added on the penultimate layer of the convolutional neural network.
7. The method of sentence classification based on convolutional neural network of claim 1, wherein the structure of the convolutional neural network further comprises: over-fitting of the convolutional neural network is prevented by using a random discarding of neuron nodes on the penultimate layer of the convolutional neural network.
8. A system for sentence classification based on a convolutional neural network, comprising:
the vector extraction module is configured to encode letters and characters in a sentence and then respectively obtain letter vectors and character vectors according to the context relations of the letters and the characters;
the mixed vector calculation module is configured to perform cross multiplication on the letter vectors and the character vectors in sequence according to the sentence structure to obtain mixed vectors;
and the classification module is configured to extract the features of the mixed vector based on the convolutional neural network and obtain a sentence classification result according to the preset tag type based on the extracted features.
9. An electronic device comprising a memory and a processor and computer instructions stored on the memory and executed on the processor, the computer instructions when executed by the processor performing the method of any of claims 1-7.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the method of any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110394681.4A CN113343676B (en) 2021-04-13 2021-04-13 Sentence classification method and system based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110394681.4A CN113343676B (en) 2021-04-13 2021-04-13 Sentence classification method and system based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN113343676A 2021-09-03
CN113343676B 2022-12-06

Family

ID=77467942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110394681.4A Expired - Fee Related CN113343676B (en) 2021-04-13 2021-04-13 Sentence classification method and system based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN113343676B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110162799A (en) * 2018-11-28 2019-08-23 腾讯科技(深圳)有限公司 Model training method, machine translation method and relevant apparatus and equipment
CN111382243A (en) * 2018-12-29 2020-07-07 深圳市优必选科技有限公司 Text category matching method, text category matching device and terminal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110119505A (en) * 2018-02-05 2019-08-13 阿里巴巴集团控股有限公司 Term vector generation method, device and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
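A network security named entity recognition method based on component CNN; 魏笑 et al.; 《计算机与数字工程》; 20200120 (No. 01); full text *
Sentiment orientation analysis of Uyghur sentences based on deep learning; 李敏 et al.; 《计算机工程与设计》; 20160816 (No. 08); full text *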

Also Published As

Publication number Publication date
CN113343676A (en) 2021-09-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20221206