CN112800183B - Content name data processing method and terminal equipment - Google Patents
- Publication number: CN112800183B
- Application number: CN202110212680.3A
- Authority: CN (China)
- Prior art keywords: initial, vector, name data, feature vectors, target feature
- Legal status: Active (an assumption, not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F16/3344—Query execution using natural language analysis
- G06F16/3335—Syntactic pre-processing, e.g. stopword elimination, stemming
- G06F40/242—Dictionaries
- G06F40/30—Semantic analysis
Abstract
The application belongs to the technical field of data processing and provides a content name data processing method and a terminal device. The method includes: acquiring content name data to be processed, and converting it into an initial matrix vector according to a preset query dictionary; performing feature extraction on the initial matrix vector to obtain a first number of initial feature vectors; performing dimension reduction on each initial feature vector to obtain a first number of target feature vectors; and linearly combining the first number of target feature vectors to obtain a target code value. By further integrating the target feature vectors obtained after feature extraction and dimension reduction, the method retains semantic information to the greatest extent, reduces data storage consumption, and can meet the application requirements of a variety of scenarios.
Description
Technical Field
The application belongs to the technical field of data processing, and in particular relates to a content name data processing method and a terminal device.
Background
A content-centric network is a new Internet technology developed in recent years. Its communication process uses variable-length, borderless content name data instead of IP addresses. Its unique route-node cache mechanism and dedicated forwarding-plane structure realize data sharing in the true sense, meet communication requirements such as mobility and high reliability, and greatly improve the data transmission efficiency of mobile communication.
In the prior art, processing of variable-length, borderless content name data either loses a large amount of content name information or incurs heavy memory consumption, and therefore cannot meet practical demands.
Disclosure of Invention
In view of this, embodiments of the present application provide a content name data processing method and a terminal device, to solve the problems in the prior art that processing methods for variable-length, borderless content name data suffer large information loss and heavy storage consumption and cannot meet practical demands.
A first aspect of the embodiments of the present application provides a content name data processing method, including:
acquiring content name data to be processed, and converting the content name data to be processed into an initial matrix vector according to a preset query dictionary;
performing feature extraction on the initial matrix vector to obtain a first number of initial feature vectors;
performing dimension reduction on each initial feature vector to obtain a first number of target feature vectors; and
linearly combining the first number of target feature vectors to obtain a target code value.
A second aspect of the embodiments of the present application provides a content name data processing apparatus, including:
an encoding module, configured to acquire content name data to be processed and convert it into an initial matrix vector according to a preset query dictionary;
a feature extraction module, configured to perform feature extraction on the initial matrix vector to obtain a first number of initial feature vectors;
a dimension reduction module, configured to perform dimension reduction on each initial feature vector to obtain a first number of target feature vectors; and
a linear combination module, configured to linearly combine the first number of target feature vectors to obtain a target code value.
A third aspect of the embodiments of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the steps of the content name data processing method provided in the first aspect of the embodiments of the present application.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the content name data processing method as provided in the first aspect of the embodiments of the present application.
An embodiment of the present application provides a content name data processing method, including: acquiring content name data to be processed, and converting it into an initial matrix vector according to a preset query dictionary; performing feature extraction on the initial matrix vector to obtain a first number of initial feature vectors; performing dimension reduction on each initial feature vector to obtain a first number of target feature vectors; and linearly combining the first number of target feature vectors to obtain a target code value. By further integrating the target feature vectors obtained after feature extraction and dimension reduction, the embodiment retains semantic information to the greatest extent, reduces data storage consumption, and can meet the application requirements of a variety of scenarios.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic implementation flow diagram of a content name data processing method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a content name data processing apparatus according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
In order to illustrate the technical scheme of the application, the following description is made by specific examples.
Referring to Fig. 1, an embodiment of the present application provides a content name data processing method, including:
S101: acquiring content name data to be processed, and converting it into an initial matrix vector according to a preset query dictionary;
S102: performing feature extraction on the initial matrix vector to obtain a first number of initial feature vectors;
S103: performing dimension reduction on each initial feature vector to obtain a first number of target feature vectors;
S104: linearly combining the first number of target feature vectors to obtain a target code value.
All possible characters are encoded, and each character corresponds to a unique code value, forming the preset query dictionary. In an embodiment of the application, the characters in the content name data to be processed are matched one by one, in order, against the preset query dictionary to obtain the code value of each character. For example, each character in the preset query dictionary is represented as an N-dimensional vector containing only 0s and 1s; character a may be represented as a 1×N vector (1, 0, …, 0), and each character corresponds to a unique vector. If the actual string length of the content name data to be processed is M, character-by-character matching through the query dictionary converts the input content name data into an M×N-dimensional initial matrix vector.
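As an illustration of this dictionary lookup, the following sketch converts a name string into an M×N one-hot initial matrix vector. The alphabet here is a small hypothetical stand-in; the patent's preset query dictionary covers all possible characters.

```python
import numpy as np

# Hypothetical alphabet standing in for the preset query dictionary,
# which in practice covers all possible characters.
ALPHABET = "abcdefghijklmnopqrstuvwxyz/.-0123456789"
N = len(ALPHABET)                                   # 39 distinct characters
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def encode_name(name: str) -> np.ndarray:
    """Convert a content name of length M into an M x N one-hot matrix:
    row i holds the unique 1 x N vector of the i-th character."""
    matrix = np.zeros((len(name), N), dtype=np.float32)
    for row, ch in enumerate(name):
        matrix[row, CHAR_INDEX[ch]] = 1.0
    return matrix

X = encode_name("video/cam1.mp4")                   # M = 14 characters
print(X.shape)  # (14, 39)
```

Each row contains exactly one 1, so the matrix encodes both the character identities and their order.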
The linear combination further integrates the target feature vectors: it analyzes the proportion of each feature vector in the expression of semantic information, removes repeated features, distinguishes the importance of each feature vector, expands the matrix vector, and linearly combines the features to obtain the most compact target code value. The combination mode can be set according to practical application requirements.
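Since the patent leaves the combination mode open, the following sketch shows one possible reading: flatten each target feature vector and take a weighted sum, where the weights (assumed here, not specified by the patent) express the importance of each feature.

```python
import numpy as np

def combine(features, weights):
    """Flatten each target feature vector and take a weighted sum,
    yielding a single compact code-value vector."""
    flat = np.stack([f.ravel() for f in features])   # one row per feature
    return weights @ flat

# Three toy 2x2 target feature vectors and arbitrary importance weights.
features = [np.full((2, 2), v, dtype=float) for v in (1.0, 2.0, 3.0)]
weights = np.array([0.5, 0.3, 0.2])
code = combine(features, weights)
print(code)  # [1.7 1.7 1.7 1.7] -- each entry is 0.5*1 + 0.3*2 + 0.2*3
```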
In the embodiment of the present application, the content name data to be processed is first encoded, and the target feature vectors obtained after feature extraction and dimension reduction are then further integrated. While combining different features, semantic information is retained to the greatest extent, so that the code values of different content name data are clearly distinguished by their semantics, data storage consumption is reduced, and retrieval efficiency is improved. Moreover, because the embodiment uses character-level encoding, it can be applied at the forwarding plane of a content-centric network and can meet the application requirements of a variety of scenarios.
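The whole pipeline can be sketched end to end as follows. All concrete parameters (the tiny alphabet, two random 3×3 kernels, 2×2 pooling, uniform combination weights) are illustrative assumptions, not the patent's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
alphabet = {ch: i for i, ch in enumerate("abcdefgh/")}

name = "ab/cdefgh"                                    # S101: encode to M x N one-hot matrix
X = np.zeros((len(name), len(alphabet)))
for r, ch in enumerate(name):
    X[r, alphabet[ch]] = 1.0

def conv(X, W):                                       # S102: valid convolution
    M, N = X.shape; k = W.shape[0]
    return np.array([[np.sum(X[i:i+k, j:j+k] * W)
                      for j in range(N - k + 1)] for i in range(M - k + 1)])

def pool(S, p=2):                                     # S103: p x p max pooling
    q, l = (S.shape[0] // p) * p, (S.shape[1] // p) * p
    return S[:q, :l].reshape(q // p, p, l // p, p).max(axis=(1, 3))

feats = [pool(conv(X, rng.random((3, 3)))) for _ in range(2)]
code = np.mean([f.ravel() for f in feats], axis=0)    # S104: linear combination (uniform weights)
print(code.shape)  # (9,) -- two pooled 3x3 maps flattened and averaged
```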
In some embodiments, S102 may include:
s1021: and carrying out convolution operation on the initial matrix vectors to obtain a first number of initial feature vectors.
In an embodiment of the application, feature extraction is performed on the initial matrix vector through a convolution operation to obtain a fixed first number of initial feature vectors with different connotations. For example, the initial matrix vector is an M×N-dimensional matrix vector; convolving it step by step with x (the first number) convolution kernels of k×k dimensions yields x initial feature vectors of (M-k+1)×(N-k+1) dimensions, thereby extracting different features of the initial matrix vector. Different numbers of convolution kernels can be set according to practical application requirements to perform multiple convolutions, moderately reducing the dimension of the matrix vector and further extracting different features of the initial matrix vector.
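The shape arithmetic above can be checked with a minimal sketch, assuming x = 4 random 3×3 kernels over a 10×8 input (all values illustrative):

```python
import numpy as np

def conv2d_valid(X: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Valid 2D convolution: S(i, j) = sum over m, n of X(i+m, j+n) * W(m, n)."""
    M, N = X.shape
    k = W.shape[0]
    S = np.empty((M - k + 1, N - k + 1), dtype=X.dtype)
    for i in range(M - k + 1):
        for j in range(N - k + 1):
            S[i, j] = np.sum(X[i:i + k, j:j + k] * W)
    return S

rng = np.random.default_rng(0)
X = rng.random((10, 8))                                  # stand-in M x N initial matrix
kernels = [rng.random((3, 3)) for _ in range(4)]         # x = 4 kernels, k = 3
features = [conv2d_valid(X, W) for W in kernels]
print(features[0].shape)  # (8, 6) == (M-k+1, N-k+1)
```

Each kernel produces one (M-k+1)×(N-k+1) feature map, so the number of kernels fixes the first number of initial feature vectors.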
In some embodiments, before S102, the method may further include:
s105: and expanding the initial matrix vector according to the size of the convolution kernel in the convolution operation to obtain an expanded initial matrix vector.
Accordingly, S1021 may include:
and performing convolution operation on the expanded initial matrix vectors to obtain a first number of initial feature vectors.
The dimension of the initial matrix vector should match the size of the convolution kernel; to improve the applicability of the method, the kernel size is fixed in the embodiment of the present application. Therefore, the maximum length L1 over a large number of real name data and the number L2 of distinct characters are counted, the initial matrix vector is expanded, and the region beyond the initial matrix vector is filled with zero vectors, forming an L1×L2-dimensional matrix vector.
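The expansion step amounts to zero-padding every matrix to the same fixed size; a minimal sketch, with L1 = 16 and L2 = 8 chosen purely for illustration:

```python
import numpy as np

def pad_to(X: np.ndarray, L1: int, L2: int) -> np.ndarray:
    """Zero-pad an M x N initial matrix vector to the fixed L1 x L2 size
    (L1: maximum observed name length, L2: number of distinct characters)."""
    M, N = X.shape
    if M > L1 or N > L2:
        raise ValueError("matrix exceeds the fixed L1 x L2 size")
    return np.pad(X, ((0, L1 - M), (0, L2 - N)))   # default mode pads with zeros

X = np.ones((5, 4))            # a short name's 5 x 4 initial matrix
padded = pad_to(X, 16, 8)
print(padded.shape)            # (16, 8)
print(padded.sum())            # 20.0 -- padding adds only zeros
```

Fixing the input size this way lets one fixed-size set of kernels handle names of any length up to L1.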
In some embodiments, the size of the convolution kernel in the convolution operation is 3×3.
The choice of parameters in the convolution operation determines the breadth and depth of feature extraction. The number of convolution kernels equals the number of extracted feature vectors: too many kernels lead to redundant feature vectors, while too few lead to incomplete feature extraction. The semantic information of content name data in the practical application therefore needs to be analyzed to select a reasonable number of kernels. In the embodiment of the application, 3×3 convolution kernels are used for feature extraction, ensuring the integrity of feature extraction.
In some embodiments, the calculation formula for the initial feature vector is:
S(i, j) = Σ_m Σ_n X(i+m, j+n)·W(m, n)
where X is the initial matrix vector, W is the convolution kernel, i and j index the dimensions of the matrix vector X, and m and n index the dimensions of the convolution kernel W. In the embodiment of the application, a one-dimensional convolution operation is used for feature extraction.
In some embodiments, S103 may include:
s1031: and respectively carrying out pooling operation on each initial feature vector to obtain a first number of target feature vectors.
A p-dimensional maximum pooling operation is performed on the x (M-k+1)×(N-k+1)-dimensional matrix vectors obtained after convolution, finally yielding x matrix vectors with [(M-k+1)×(N-k+1)]/p elements each. The pooling operation performs feature screening, redundancy removal, and fusion. Its parameters can be set according to actual needs so that redundant information is removed and the feature vector dimension is reduced while retaining semantic information as far as possible.
In some embodiments, the size of the pooling window in the pooling operation is 2×2.
The size of the pooling window determines the degree of redundancy removal in feature extraction. If the window is too large, some important vector values are lost; if it is too small, too many redundant values are retained. An appropriate window size therefore has to be chosen to remove redundancy while retaining important information to the greatest extent, for example a 2×2 pooling window.
In some embodiments, the calculation formula of the target feature vector is:
P(r, s) = max_p{S(q, l)}
where S is the initial feature vector, P is the target feature vector, p is the size of the pooling region, r and s are the dimensions of the target feature vector, and q and l are the dimensions of the initial feature vector.
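A minimal sketch of this max-pooling dimension reduction, assuming non-overlapping 2×2 windows (p = 2) and dropping trailing rows or columns that do not fill a full window:

```python
import numpy as np

def max_pool(S: np.ndarray, p: int = 2) -> np.ndarray:
    """Non-overlapping p x p max pooling: P(r, s) = max over the p x p
    block of S at block coordinates (r, s)."""
    q, l = S.shape
    q, l = q - q % p, l - l % p          # trim to a multiple of the window
    return S[:q, :l].reshape(q // p, p, l // p, p).max(axis=(1, 3))

S = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 initial feature vector
P = max_pool(S, 2)
print(P)
# [[ 5.  7.]
#  [13. 15.]]
```

Each 2×2 block keeps only its maximum, quartering the number of values while preserving the strongest responses.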
In some embodiments, after S103, the content name data processing method further includes:
s106: screening the first number of target feature vectors, removing the same target feature vectors, and obtaining a second number of target feature vectors;
accordingly, S104 may include:
s1041: and linearly combining the second number of target feature vectors to obtain target code values.
In the embodiment of the application, the target feature vector is screened, redundant data is removed, and the calculation efficiency is improved.
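The screening step can be sketched as follows; exact element-wise equality is assumed here as the duplicate criterion, which is one possible reading of removing "the same target feature vectors":

```python
import numpy as np

def screen(features):
    """Remove duplicate target feature vectors, keeping first occurrences,
    so only a second (smaller) number of distinct vectors remain."""
    kept = []
    for f in features:
        if not any(np.array_equal(f, g) for g in kept):
            kept.append(f)
    return kept

a = np.array([1.0, 2.0])
features = [a, a.copy(), np.array([3.0, 4.0])]   # first number = 3, one duplicate
print(len(screen(features)))  # 2 -- the second number of target feature vectors
```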
It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation process of the embodiments of the present application.
Referring to fig. 2, an embodiment of the present application further provides a content name data processing apparatus, including:
the encoding module 21 is configured to obtain content name data to be processed, and convert the content name data to be processed into an initial matrix vector according to a preset query dictionary;
the feature extraction module 22 is configured to perform feature extraction on the initial matrix vectors to obtain a first number of initial feature vectors;
the dimension reduction module 23 is configured to reduce dimensions of each initial feature vector to obtain a first number of target feature vectors;
the linear combination module 24 is configured to perform linear combination on the first number of target feature vectors to obtain a target code value.
In some embodiments, feature extraction module 22 may include:
the convolution unit 221 is configured to perform a convolution operation on the initial matrix vectors to obtain a first number of initial feature vectors.
In some embodiments, the size of the convolution kernel in the convolution operation is 3×3.
In some embodiments, the calculation formula for the initial feature vector is:
S(i, j) = Σ_m Σ_n X(i+m, j+n)·W(m, n)
where X is the initial matrix vector, W is the convolution kernel, i and j index the dimensions of the matrix vector X, and m and n index the dimensions of the convolution kernel W.
In some embodiments the dimension reduction module 23 may include:
the pooling unit 231 is configured to perform pooling operation on each initial feature vector, so as to obtain a first number of target feature vectors.
In some embodiments, the size of the pooling window in the pooling operation is 2×2.
In some embodiments, the calculation formula of the target feature vector is:
P(r, s) = max_p{S(q, l)}
where S is the initial feature vector, P is the target feature vector, p is the size of the pooling region, r and s are the dimensions of the target feature vector, and q and l are the dimensions of the initial feature vector.
In some embodiments, the content name data processing apparatus may further include:
a screening module 26, configured to screen the first number of target feature vectors, remove the same target feature vectors, and obtain a second number of target feature vectors;
correspondingly, the linear combination module 24 is further configured to perform linear combination on the second number of target feature vectors to obtain a target code value.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional allocation may be performed by different functional units and modules, that is, the internal structure of the terminal device is divided into different functional units or modules, so as to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above device may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
Fig. 3 is a schematic block diagram of a terminal device according to an embodiment of the present application. As shown in fig. 3, the terminal device 4 of this embodiment includes: one or more processors 40, a memory 41, and a computer program 42 stored in the memory 41 and executable on the processor 40. The steps in the respective embodiments of the content name data processing method described above, such as steps S101 to S104 shown in fig. 1, are implemented when the processor 40 executes the computer program 42. Alternatively, the processor 40, when executing the computer program 42, implements the functions of the modules/units of the embodiment of the content name data processing device described above, such as the functions of the modules 21 to 24 shown in fig. 2.
Illustratively, the computer program 42 may be partitioned into one or more modules/units, which are stored in the memory 41 and executed by the processor 40 to complete the present application. One or more of the modules/units may be a series of computer program instruction segments capable of performing a specific function for describing the execution of the computer program 42 in the terminal device 4. For example, the computer program 42 may be partitioned into an encoding module 21, a feature extraction module 22, a dimension reduction module 23, and a linear combination module 24.
The encoding module 21 is configured to obtain content name data to be processed, and convert the content name data to be processed into an initial matrix vector according to a preset query dictionary;
the feature extraction module 22 is configured to perform feature extraction on the initial matrix vectors to obtain a first number of initial feature vectors;
the dimension reduction module 23 is configured to reduce dimensions of each initial feature vector to obtain a first number of target feature vectors;
the linear combination module 24 is configured to perform linear combination on the first number of target feature vectors to obtain a target code value.
The terminal device 4 includes, but is not limited to, the processor 40 and the memory 41. It will be appreciated by those skilled in the art that Fig. 3 is only one example of a terminal device and does not constitute a limitation of the terminal device 4, which may include more or fewer components than illustrated, combine certain components, or have different components; for example, the terminal device 4 may also include an input device, an output device, a network access device, a bus, etc.
The processor 40 may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 41 may be an internal storage unit of the terminal device, such as a hard disk or a memory of the terminal device. The memory 41 may also be an external storage device of the terminal device, such as a plug-in hard disk provided on the terminal device, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like. Further, the memory 41 may also include both an internal storage unit of the terminal device and an external storage device. The memory 41 is used for storing a computer program 42 and other programs and data required by the terminal device. The memory 41 may also be used to temporarily store data that has been output or is to be output.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts not described or detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed terminal device and method may be implemented in other manners. For example, the above-described terminal device embodiments are merely illustrative, e.g., the division of modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the present application may implement all or part of the flow of the methods of the above embodiments by instructing related hardware through a computer program, which may be stored in a computer-readable storage medium; when executed by a processor, the computer program implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.
Claims (5)
1. A content name data processing method, characterized by comprising:
acquiring content name data to be processed, and converting the content name data to be processed into an initial matrix vector according to a preset query dictionary, wherein the content name data to be processed comprises M characters, the preset query dictionary comprises a representation corresponding to each character, and the representation is a 1×N-dimensional vector;
extracting features of the initial matrix vectors to obtain a first number of initial feature vectors;
respectively performing dimension reduction on each initial feature vector to obtain the first number of target feature vectors;
linearly combining the first number of target feature vectors to obtain a target code value;
the converting the content name data to be processed into an initial matrix vector according to a preset query dictionary includes:
matching each character in the content name data to be processed with the characters in the preset query dictionary to obtain the representations of the M characters;
combining the representations of the M characters to obtain an M×N-dimensional initial matrix vector;
the feature extraction is performed on the initial matrix vectors to obtain a first number of initial feature vectors, including:
performing a convolution operation on the M×N-dimensional initial matrix vector with a first number of k×k-dimensional convolution kernels to obtain the first number of (M-k+1)×(N-k+1)-dimensional initial feature vectors;
the performing a convolution operation on the M×N-dimensional initial matrix vector with a first number of k×k-dimensional convolution kernels to obtain the first number of (M-k+1)×(N-k+1)-dimensional initial feature vectors includes:
expanding the M×N-dimensional initial matrix vector according to the size of the convolution kernel in the convolution operation to obtain an expanded initial matrix vector, and performing a convolution operation on the expanded initial matrix vector to obtain the first number of (M-k+1)×(N-k+1)-dimensional initial feature vectors;
the step of performing dimension reduction on each initial feature vector to obtain the first number of target feature vectors includes:
respectively performing a pooling operation on each initial feature vector to obtain the first number of target feature vectors; the calculation formula of the target feature vector is as follows:
P(r,s) = max_p{ S(M-k+1, N-k+1) }; wherein S represents an initial feature vector, M-k+1 and N-k+1 represent the dimensions of the initial feature vector, P represents the target feature vector, p represents the size of the pooling region, max represents the pooling operation, and r and s represent the dimensions of the target feature vector;
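The formula above can be sketched as non-overlapping p×p max pooling; the example feature map S below is illustrative, not from the patent:

```python
import numpy as np

def max_pool(S, p=2):
    """P(r, s) = max over each p x p region of the feature map S;
    the output dimensions r, s are the input dimensions divided by p."""
    h = S.shape[0] // p * p
    w = S.shape[1] // p * p
    S = S[:h, :w]                                    # drop ragged edges
    return S.reshape(h // p, p, w // p, p).max(axis=(1, 3))

S = np.array([[1., 3., 2., 0.],
              [4., 2., 1., 5.],
              [0., 1., 3., 2.],
              [2., 6., 1., 4.]])
P = max_pool(S, p=2)    # 2x2 pooling window, as in claim 2
```

Each output entry is the maximum of one 2×2 region of S, so the 4×4 map reduces to a 2×2 target feature vector.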
wherein the linear combination analyzes the proportion of the different target feature vectors in the expression of semantic information, distinguishes the importance of each target feature vector, and integrates the target feature vectors to obtain a target code value, so as to preserve the semantic information of the content name data;
after the dimension reduction is performed on each initial feature vector to obtain the first number of target feature vectors, the content name data processing method further includes:
screening the first number of target feature vectors and removing duplicate target feature vectors to obtain a second number of target feature vectors;
correspondingly, the linearly combining the first number of target feature vectors to obtain a target code value includes:
linearly combining the second number of target feature vectors to obtain the target code value.
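The screening and linear-combination steps of claim 1 can be sketched together as follows. The equal weights are a placeholder assumption; the patent assigns weights according to each vector's share in expressing semantic information, which it does not specify numerically:

```python
import numpy as np

def dedup_and_combine(vectors, weights=None):
    """Screen out duplicate target feature vectors, then linearly combine
    the remaining (second number of) vectors into one target code value."""
    unique = []
    for v in vectors:
        if not any(np.array_equal(v, u) for u in unique):
            unique.append(v)
    if weights is None:
        weights = [1.0] * len(unique)      # hypothetical equal weights
    code = sum(w * v for w, v in zip(weights, unique))
    return unique, code

vectors = [np.array([1., 2.]), np.array([1., 2.]), np.array([3., 4.])]
unique, code = dedup_and_combine(vectors)  # second number = 2
```

Here the first number of vectors is 3, one duplicate is screened out, and the two survivors are combined into a single target code value.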
2. The content name data processing method according to claim 1, wherein a size of a pooling window in the pooling operation is 2×2.
3. A content name data processing apparatus, characterized by comprising:
the coding module is used for acquiring the content name data to be processed and converting the content name data to be processed into an initial matrix vector according to a preset query dictionary; the content name data to be processed comprises M characters, the preset query dictionary comprises a representation mode corresponding to each character, and the representation mode is a 1×N-dimensional vector;
the feature extraction module is used for extracting features of the initial matrix vector to obtain a first number of initial feature vectors;
the dimension reduction module is used for respectively reducing the dimension of each initial feature vector to obtain the first number of target feature vectors;
the linear combination module is used for carrying out linear combination on the first number of target feature vectors to obtain target code values;
the converting the content name data to be processed into an initial matrix vector according to a preset query dictionary includes:
matching each character in the content name data to be processed with the characters in the preset query dictionary to obtain the representation modes of the M characters;
combining the representation modes of the M characters to obtain an M×N-dimensional initial matrix vector;
the extracting features of the initial matrix vector to obtain a first number of initial feature vectors includes:
performing a convolution operation on the M×N-dimensional initial matrix vector with a first number of k×k-dimensional convolution kernels to obtain the first number of (M-k+1)×(N-k+1)-dimensional initial feature vectors;
the performing a convolution operation on the M×N-dimensional initial matrix vector with a first number of k×k-dimensional convolution kernels to obtain the first number of (M-k+1)×(N-k+1)-dimensional initial feature vectors includes:
expanding the M×N-dimensional initial matrix vector according to the size of the convolution kernel used in the convolution operation to obtain an expanded initial matrix vector, and performing the convolution operation on the expanded initial matrix vector to obtain the first number of (M-k+1)×(N-k+1)-dimensional initial feature vectors;
the dimension reduction module is used for respectively performing a pooling operation on each initial feature vector to obtain the first number of target feature vectors; the calculation formula of the target feature vector is as follows:
P(r,s) = max_p{ S(M-k+1, N-k+1) }; wherein S represents an initial feature vector, M-k+1 and N-k+1 represent the dimensions of the initial feature vector, P represents the target feature vector, p represents the size of the pooling region, max represents the pooling operation, and r and s represent the dimensions of the target feature vector;
wherein the linear combination analyzes the proportion of the different target feature vectors in the expression of semantic information, distinguishes the importance of each target feature vector, and integrates the target feature vectors to obtain a target code value, so as to preserve the semantic information of the content name data;
the screening module is used for screening the first number of target feature vectors and removing duplicate target feature vectors to obtain a second number of target feature vectors;
correspondingly, the linear combination module is further configured to perform linear combination on the second number of target feature vectors to obtain the target code value.
4. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the content name data processing method according to any of claims 1 to 2 when the computer program is executed.
5. A computer-readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the content name data processing method according to any one of claims 1 to 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110212680.3A CN112800183B (en) | 2021-02-25 | 2021-02-25 | Content name data processing method and terminal equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112800183A CN112800183A (en) | 2021-05-14 |
CN112800183B true CN112800183B (en) | 2023-09-26 |
Family
ID=75815847
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110212680.3A Active CN112800183B (en) | 2021-02-25 | 2021-02-25 | Content name data processing method and terminal equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112800183B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114510525B (en) * | 2022-04-18 | 2022-08-30 | 深圳丰尚智慧农牧科技有限公司 | Data format conversion method and device, computer equipment and storage medium |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1345441A (en) * | 1999-12-17 | 2002-04-17 | Sony Corporation | Information processor and processing method and information storage medium
CN107562729A (en) * | 2017-09-14 | 2018-01-09 | Yunnan University | Party-building document representation method based on neural network and topic enhancement
CN107908757A (en) * | 2017-11-21 | 2018-04-13 | Hengan Jiaxin (Beijing) Technology Co., Ltd. | Website classification method and system
CN108055529A (en) * | 2017-12-25 | 2018-05-18 | State Grid Corporation of China | Artificial intelligence analysis system for normalizing image data from electric power UAVs and robots
CN108804423A (en) * | 2018-05-30 | 2018-11-13 | Ping An Medical and Healthcare Management Co., Ltd. | Medical text feature extraction and automatic matching method and system
CN109213975A (en) * | 2018-08-23 | 2019-01-15 | Chongqing University of Posts and Telecommunications | Tweet document representation method based on character-level convolutional variational autoencoding
CN109255377A (en) * | 2018-08-30 | 2019-01-22 | Beijing Xinlifang Technology Development Co., Ltd. | Instrument recognition method, device, electronic equipment and storage medium
CN110019793A (en) * | 2017-10-27 | 2019-07-16 | Alibaba Group Holding Ltd. | Text semantic coding method and device
CN110162601A (en) * | 2019-05-22 | 2019-08-23 | Jilin University | Deep-learning-based biomedical publication submission recommender system
JP2019149161A (en) * | 2018-02-27 | 2019-09-05 | Ricoh Co., Ltd. | Method for generating word expression, device, and computer-readable storage medium
CN110557439A (en) * | 2019-08-07 | 2019-12-10 | China United Network Communications Group Co., Ltd. | Network content management method and blockchain content network platform
CN111339775A (en) * | 2020-02-11 | 2020-06-26 | Ping An Technology (Shenzhen) Co., Ltd. | Named entity identification method, device, terminal equipment and storage medium
CN111666482A (en) * | 2019-03-06 | 2020-09-15 | Gree Electric Appliances, Inc. of Zhuhai | Query method and device, storage medium and processor
WO2020224219A1 (en) * | 2019-05-06 | 2020-11-12 | Ping An Technology (Shenzhen) Co., Ltd. | Chinese word segmentation method and apparatus, electronic device and readable storage medium
CN112149710A (en) * | 2019-06-28 | 2020-12-29 | Intel Corporation | Machine-generated content naming in information-centric networks
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11366990B2 (en) * | 2017-05-15 | 2022-06-21 | International Business Machines Corporation | Time-series representation learning via random time warping |
US20190251480A1 (en) * | 2018-02-09 | 2019-08-15 | NEC Laboratories Europe GmbH | Method and system for learning of classifier-independent node representations which carry class label information |
US11182559B2 (en) * | 2019-03-26 | 2021-11-23 | Siemens Aktiengesellschaft | System and method for natural language processing |
2021-02-25: CN application CN202110212680.3A, patent CN112800183B/en, status Active
Non-Patent Citations (4)
Title |
---|
Simulation of a data optimization mining model for mobile terminals in content-centric networks; Li Xiaodong; Wei Huiru; Bulletin of Science and Technology (10); full text *
Xu Jiepan. Introduction to Artificial Intelligence. China Railway Publishing House Co., Ltd., 2019, pp. 116-117. *
Hu Panpan. Natural Language Processing: From Introduction to Practice. China Railway Publishing House Co., Ltd., 2020, pp. 54-56. *
Multi-domain segment routing mechanism for software-defined content-centric networks; Li Gen; Yi Peng; Zhang Zhen; Application Research of Computers (09); full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106549673B (en) | Data compression method and device | |
CN110830435A (en) | Method and device for extracting network flow space-time characteristics and detecting abnormity | |
JP6681313B2 (en) | Method, computer program and system for encoding data | |
CN106849956B (en) | Compression method, decompression method, device and data processing system | |
CN112800183B (en) | Content name data processing method and terminal equipment | |
CN114614829A (en) | Satellite data frame processing method and device, electronic equipment and readable storage medium | |
CN110769263A (en) | Image compression method and device and terminal equipment | |
CN108880559B (en) | Data compression method, data decompression method, compression equipment and decompression equipment | |
CN106293542B (en) | Method and device for decompressing file | |
CN111178513B (en) | Convolution implementation method and device of neural network and terminal equipment | |
CN111384972A (en) | Optimization method and device of multi-system LDPC decoding algorithm and decoder | |
WO2023159820A1 (en) | Image compression method, image decompression method, and apparatuses | |
CN104682966B (en) | The lossless compression method of table data | |
WO2022179355A1 (en) | Data processing method and apparatus for sample adaptive offset sideband compensating mode | |
CN115765756A (en) | Lossless data compression method, system and device for high-speed transparent transmission | |
CN111224674B (en) | Decoding method, device and decoder for multi-system LDPC code | |
CN111049836A (en) | Data processing method, electronic device and computer readable storage medium | |
CN110913220A (en) | Video frame coding method and device and terminal equipment | |
CN102395031B (en) | Data compression method | |
CN113595557B (en) | Data processing method and device | |
CN112686966B (en) | Lossless image compression method and device | |
CN108989813A (en) | A kind of high efficiency of compression/decompression method, computer installation and storage medium | |
CN115062673B (en) | Image processing method, image processing device, electronic equipment and storage medium | |
CN112669396B (en) | Lossless image compression method and device | |
CN112200301B (en) | Convolution computing device and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||