CN112732890A - Population data feature extraction method and device and terminal equipment - Google Patents
Population data feature extraction method and device and terminal equipment
- Publication number
- CN112732890A CN112732890A CN202011567356.5A CN202011567356A CN112732890A CN 112732890 A CN112732890 A CN 112732890A CN 202011567356 A CN202011567356 A CN 202011567356A CN 112732890 A CN112732890 A CN 112732890A
- Authority
- CN
- China
- Prior art keywords
- data
- feature extraction
- population
- text
- population data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/313—Selection or weighting of terms for indexing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Abstract
The invention provides a population data feature extraction method, a population data feature extraction device and terminal equipment, wherein the method comprises the following steps: acquiring population data of a target population monitoring system; inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data; extracting effective keywords of text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data; and determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data. The population data feature extraction method, the population data feature extraction device and the terminal equipment can more accurately and comprehensively extract the features of population data.
Description
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a population data feature extraction method, a population data feature extraction device and terminal equipment.
Background
With the rapid development of the internet and big data technology, massive data of all kinds are growing at high speed, and the big data era has naturally arrived. The new ways of thinking and application technologies of the big data era bring unique development prospects for government intelligent services and the like, and population monitoring systems have emerged accordingly.
It is known that a population monitoring system stores a large amount of population data, and the data types of the population data are complex. How to extract the data features of the population data more accurately, so as to facilitate subsequent data processing, has therefore become a problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a population data feature extraction method, a population data feature extraction device and terminal equipment, so as to improve the feature extraction precision of population data.
In a first aspect of the embodiments of the present invention, a population data feature extraction method is provided, where the population data feature extraction method is applied to a population monitoring system, and the method includes:
acquiring population data of a target population monitoring system;
inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data;
extracting effective keywords of text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data;
and determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data.
In a second aspect of the embodiments of the present invention, there is provided a population data feature extraction device, where the population data feature extraction device is applied to a population monitoring system, and the population data feature extraction device includes:
the data acquisition module is used for acquiring population data of the target population monitoring system;
the first feature extraction module is used for inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data;
the second feature extraction module is used for extracting effective keywords of the text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data;
and the characteristic fusion module is used for determining a characteristic extraction result corresponding to each population data based on the non-text characteristic vector corresponding to each population data and the text characteristic vector corresponding to each population data.
In a third aspect of the embodiments of the present invention, there is provided a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above-mentioned population data feature extraction method when executing the computer program.
In a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of the above-mentioned population data feature extraction method.
The population data feature extraction method, the population data feature extraction device and the terminal equipment provided by the embodiment of the invention have the beneficial effects that:
the method is different from the scheme of directly extracting the features in the prior art, the text data and the non-text data in the population data are distinguished, the feature vectors of the text data and the non-text data are respectively extracted, and finally the feature extraction result of the population data is determined based on the text feature vectors and the non-text feature vectors. In other words, compared with the prior art, the method and the device can describe the population data more comprehensively and accurately, and realize more accurate data feature extraction.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; other drawings can be obtained by those of ordinary skill in the art from these drawings without creative effort.
Fig. 1 is a schematic flow chart of a population data feature extraction method according to an embodiment of the present invention;
fig. 2 is a block diagram of a population data feature extraction apparatus according to an embodiment of the present invention;
fig. 3 is a schematic block diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects to be solved by the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic flow chart of a population data feature extraction method provided in an embodiment of the present invention, the population data feature extraction method is applied to a population monitoring system, and the population data feature extraction method includes:
s101: and acquiring population data of the target population monitoring system.
S102: and inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data.
S103: and extracting effective keywords of the text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups corresponding to each population data to obtain text feature vectors corresponding to each population data.
In this embodiment, the fuzzification conversion is performed on the effective keyword group to obtain the text feature vector corresponding to each population data, which can be detailed as follows:
and for an effective keyword group corresponding to certain population data, performing fuzzification conversion on each effective keyword in the effective keyword group to obtain fuzzy quantities corresponding to each effective keyword, and combining the fuzzy quantities corresponding to each effective keyword to obtain a text feature vector corresponding to the population data.
S104: and determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data.
In this embodiment, the feature extraction result corresponding to each population data is determined based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data, which may be detailed as follows:
combining the non-text feature vectors corresponding to the population data and the text feature vectors corresponding to the population data, and taking the feature vectors corresponding to the combined population data as feature extraction results corresponding to the population data.
Unlike the prior-art scheme of extracting features directly, the method distinguishes the text data and the non-text data in the population data, extracts feature vectors for the text data and the non-text data respectively, and finally determines the feature extraction result of the population data based on the text feature vectors and the non-text feature vectors. In other words, compared with the prior art, the method and the device can describe the population data more comprehensively and accurately, achieving more accurate data feature extraction.
Optionally, as a specific implementation manner of the population data feature extraction method provided in the embodiment of the present invention, the training method of the feature extraction model is as follows:
and acquiring population sample data, inputting the population sample data into a pre-constructed feature extraction network, and training to obtain a first feature extraction model.
Extracting the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model, and determining preferred weight coefficients based on the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model.
And randomly increasing the values of the preferred weight coefficients in the first feature extraction model according to a preset proportion to obtain a plurality of second feature extraction models, and inputting the population sample data into the plurality of second feature extraction models to obtain a plurality of groups of auxiliary sample data.
And performing secondary training on the first feature extraction model based on the population sample data and the multiple groups of auxiliary sample data to obtain a trained feature extraction model.
In this embodiment, since each random increase of the preferred weight coefficients is different, the second feature extraction models obtained are also different; therefore, the values of the preferred weight coefficients in the first feature extraction model can be randomly increased several times according to the preset proportion to obtain a plurality of second feature extraction models.
In the present embodiment, the preferred weight coefficients refer to weight coefficients that greatly affect the feature extraction accuracy.
In this embodiment, the multiple sets of auxiliary sample data are auxiliary samples generated based on the change of the preferred weight coefficient, and the introduction of the auxiliary sample data into the training of the first feature extraction model can effectively improve the diversity of the sample data, thereby improving the training precision of the first feature extraction model.
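The generation of the second feature extraction models can be sketched as below. The weights are flattened into a single list, and "randomly increasing according to a preset proportion" is read as multiplying each preferred coefficient by a random factor of at most the preset proportion; both are assumed interpretations, not stated in the patent:

```python
import random

def make_second_models(first_weights, preferred_idx, proportion, n_models, seed=0):
    """Build several second feature extraction models by randomly increasing
    the preferred weight coefficients of the first model, each by at most
    the preset proportion; non-preferred coefficients are left unchanged."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        weights = list(first_weights)
        for i in preferred_idx:
            weights[i] *= 1.0 + rng.uniform(0.0, proportion)  # random increase
        models.append(weights)
    return models

second = make_second_models([0.6, 1.05, 3.0], preferred_idx=[0, 2],
                            proportion=0.1, n_models=3)
```

Each model in `second` would then be fed the population sample data to produce one group of auxiliary sample data.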
Optionally, as a specific implementation manner of the population data feature extraction method provided by the embodiment of the present invention, determining the preferred weight coefficient based on the initialized weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model includes:
determining a rate of change of the weight coefficients based on the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model.
And taking the weight coefficient with the change rate larger than a preset threshold value as the preferred weight coefficient.
In this embodiment, the change rate of the weight coefficient is a ratio of a difference (here, an absolute value) between the weight coefficient of the first feature extraction model and the initialization weight coefficient of the feature extraction network to the initialization weight coefficient of the feature extraction network.
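The change-rate criterion just described amounts to the following sketch (coefficient lists and the threshold value are illustrative):

```python
def change_rate(w_init, w_trained):
    """Rate of change of one weight coefficient:
    |w_trained - w_init| / |w_init|, as described above."""
    return abs(w_trained - w_init) / abs(w_init)

def preferred_indices(init_weights, trained_weights, threshold):
    """Indices of the weight coefficients whose change rate exceeds the
    preset threshold; these are taken as the preferred coefficients."""
    return [i for i, (w0, w1) in enumerate(zip(init_weights, trained_weights))
            if change_rate(w0, w1) > threshold]

idx = preferred_indices([0.5, 1.0, 2.0], [0.6, 1.05, 3.0], threshold=0.1)
# idx == [0, 2]  (change rates 0.2, 0.05, 0.5)
```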
Optionally, as a specific implementation manner of the population data feature extraction method provided in the embodiment of the present invention, performing secondary training on the first feature extraction model based on population sample data and multiple sets of auxiliary sample data to obtain a trained feature extraction model, where the method includes:
and combining the human mouth sample data and the multiple groups of auxiliary sample data to obtain combined sample data, and performing secondary training on the first feature extraction model based on the combined sample data to obtain a trained feature extraction model.
In this embodiment, the population sample data and the multiple sets of auxiliary sample data may be randomly combined to obtain combined sample data, and the first feature extraction model is trained for the second time based on the combined sample data to obtain a trained feature extraction model.
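The random combination of samples can be sketched as below; reading "randomly combined" as a merge followed by a shuffle is an assumption, and the sample values are placeholders:

```python
import random

def combine_samples(population_samples, auxiliary_groups, seed=0):
    """Merge the population sample data with every group of auxiliary
    sample data into one combined training set, then shuffle it for the
    secondary training of the first feature extraction model."""
    combined = list(population_samples)
    for group in auxiliary_groups:
        combined.extend(group)
    random.Random(seed).shuffle(combined)
    return combined

merged = combine_samples(["p1", "p2"], [["a1", "a2"], ["b1"]])
# merged is a shuffled arrangement of ["p1", "p2", "a1", "a2", "b1"]
```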
Optionally, as a specific implementation manner of the population data feature extraction method provided in the embodiment of the present invention, for text data in certain population data, a method for extracting effective keywords in the text data includes:
and performing approximate search in a preset keyword library, and extracting candidate keywords in the text data and target keywords corresponding to the candidate keywords in the keyword library.
And inputting each candidate keyword and the target keyword corresponding to each candidate keyword into a preset matching model, and determining effective keywords corresponding to the text data.
In this embodiment, the valid keywords corresponding to the text data are the valid keywords corresponding to the population data, and all valid keywords corresponding to the text data form valid keyword groups corresponding to the population data.
In this embodiment, the text data in a certain population data may be a text data set describing the status of people in the target population monitoring system.
In this embodiment, a candidate keyword is a keyword similar to a target keyword in the keyword library. For example, if a target keyword is "no action ability" and the phrase "lack of action ability" or "x action x ability" (where x represents a character or word) exists in the text data, then "no action ability" becomes the target keyword corresponding to "lack of action ability" or "x action x ability" in the approximate search. That is, the approximate search looks up, in the keyword library, target keywords having phrase coincidence with a candidate keyword. It should be noted that a candidate keyword is a keyword that already corresponds to a target keyword; if a keyword in the population data does not match any target keyword in the keyword library, it cannot be called a candidate keyword.
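The approximate search can be sketched as follows. Word-level overlap is an assumed stand-in for the "phrase coincidence" criterion, which the patent does not define precisely; phrases and library entries are illustrative:

```python
def approximate_search(text_phrases, keyword_lib, min_shared_words=2):
    """Pair each phrase from the text data with a target keyword from the
    library when the two share enough words ('phrase coincidence').
    Phrases that match no target keyword are not candidate keywords and
    are simply not returned."""
    pairs = []
    for phrase in text_phrases:
        phrase_words = set(phrase.split())
        for target in keyword_lib:
            if len(phrase_words & set(target.split())) >= min_shared_words:
                pairs.append((phrase, target))  # (candidate, target keyword)
    return pairs

pairs = approximate_search(["lack of action ability", "lives alone"],
                           ["no action ability"])
# pairs == [("lack of action ability", "no action ability")]
```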
Optionally, as a specific implementation manner of the population data feature extraction method provided by the embodiment of the present invention, the preset matching model is a probabilistic neural network model.
Inputting each candidate keyword and a target keyword corresponding to each candidate keyword into a preset matching model, and determining an effective keyword corresponding to the text data, wherein the method comprises the following steps:
s41: and selecting a candidate keyword, and inputting the candidate keyword and a target keyword corresponding to the candidate keyword into a preset matching model to obtain the matching probability between the candidate keyword and the target keyword.
S42: and if the matching probability is greater than the preset probability value, taking the target keyword as an effective keyword. And if the matching probability is not greater than the preset probability value, deleting the candidate keyword.
S43: if all the candidate keywords have been traversed, each valid keyword obtained in step S42 is used as a valid keyword corresponding to the text data.
If all the candidate keywords have not been traversed, the process returns to step S41.
In this embodiment, the probabilistic neural network model is configured to receive the candidate keyword and the target keyword corresponding to the candidate keyword, and output a matching probability of the candidate keyword and the target keyword. If the matching probability of the target keyword and the candidate keyword is not greater than the preset probability value, the matching is failed, the candidate keyword is deleted, if the matching probability of the target keyword and the candidate keyword is greater than the preset probability value, the matching is successful, and the target keyword is used as an effective keyword representing the candidate keyword, so that the subsequent calculation can be performed more conveniently.
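The traversal of steps S41 to S43 can be sketched as below. A character-overlap ratio stands in for the probabilistic neural network, whose architecture the patent does not detail; the keyword pairs and threshold are illustrative:

```python
def select_valid_keywords(candidate_target_pairs, match_prob, threshold=0.5):
    """Steps S41-S43: for each candidate keyword, query the matching model
    for its probability against the target keyword; keep the target as a
    valid keyword when the probability exceeds the preset value, otherwise
    discard the candidate."""
    valid = []
    for candidate, target in candidate_target_pairs:
        if match_prob(candidate, target) > threshold:
            valid.append(target)
    return valid

def toy_match_prob(candidate, target):
    """Character-overlap stand-in for the probabilistic neural network."""
    common = set(candidate) & set(target)
    return len(common) / max(len(set(target)), 1)

valid = select_valid_keywords(
    [("lack of action ability", "no action ability"),
     ("healthy", "no action ability")],
    toy_match_prob)
# valid == ["no action ability"]
```

Note that the valid keyword is the target keyword, not the candidate: matched candidates are normalized to their library form, which is what makes subsequent computation more convenient.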
Fig. 2 is a block diagram of a population data feature extraction apparatus according to an embodiment of the present invention, which corresponds to the population data feature extraction method of the foregoing embodiment. For convenience of explanation, only the portions related to the embodiments of the present invention are shown. Referring to fig. 2, the population data feature extraction device 20 is applied to a population monitoring system, and the population data feature extraction device 20 includes: a data acquisition module 21, a first feature extraction module 22, a second feature extraction module 23 and a feature fusion module 24.
The data obtaining module 21 is configured to obtain population data of the target population monitoring system.
The first feature extraction module 22 is configured to input the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data.
The second feature extraction module 23 is configured to extract effective keywords of the text data in each population data to obtain effective keyword groups corresponding to each population data, and perform fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data.
And the feature fusion module 24 is configured to determine a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data.
Optionally, referring to fig. 2, as a specific implementation manner of the population data feature extraction apparatus provided in the embodiment of the present invention, the population data feature extraction apparatus 20 may further include a model training module 25, where the model training module 25 is configured to:
and acquiring population sample data, inputting the population sample data into a pre-constructed feature extraction network, and training to obtain a first feature extraction model.
Extracting the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model, and determining preferred weight coefficients based on the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model.
And randomly increasing the values of the preferred weight coefficients in the first feature extraction model according to a preset proportion to obtain a plurality of second feature extraction models, and inputting the population sample data into the plurality of second feature extraction models to obtain a plurality of groups of auxiliary sample data.
And performing secondary training on the first feature extraction model based on the population sample data and the multiple groups of auxiliary sample data to obtain a trained feature extraction model.
Optionally, as a specific implementation manner of the population data feature extraction apparatus provided in the embodiment of the present invention, determining the preferred weight coefficient based on the initialized weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model includes:
determining a rate of change of the weight coefficients based on the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model.
And taking the weight coefficient with the change rate larger than a preset threshold value as the preferred weight coefficient.
Optionally, as a specific implementation manner of the population data feature extraction device provided in the embodiment of the present invention, performing secondary training on the first feature extraction model based on population sample data and multiple sets of auxiliary sample data to obtain a trained feature extraction model, including:
and combining the human mouth sample data and the multiple groups of auxiliary sample data to obtain combined sample data, and performing secondary training on the first feature extraction model based on the combined sample data to obtain a trained feature extraction model.
Optionally, as a specific implementation manner of the population data feature extraction device provided in the embodiment of the present invention, for text data in certain population data, a method for extracting effective keywords in the text data includes:
and performing approximate search in a preset keyword library, and extracting candidate keywords in the text data and target keywords corresponding to the candidate keywords in the keyword library.
And inputting each candidate keyword and the target keyword corresponding to each candidate keyword into a preset matching model, and determining effective keywords corresponding to the text data.
Optionally, as a specific implementation manner of the population data feature extraction device provided in the embodiment of the present invention, determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data includes:
combining the non-text feature vectors corresponding to the population data and the text feature vectors corresponding to the population data, and taking the feature vectors corresponding to the combined population data as feature extraction results corresponding to the population data.
Referring to fig. 3, fig. 3 is a schematic block diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 3, the terminal 300 in this embodiment may include: one or more processors 301, one or more input devices 302, one or more output devices 303, and one or more memories 304. The processor 301, the input device 302, the output device 303, and the memory 304 communicate with each other via a communication bus 305. The memory 304 is used to store a computer program comprising program instructions, and the processor 301 is configured to execute the program instructions stored in the memory 304. Specifically, the processor 301 is configured to call the program instructions to perform the functions of the modules/units in the above-described device embodiments, such as the functions of the modules 21 to 25 shown in fig. 2.
It should be understood that, in the embodiment of the present invention, the processor 301 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The input device 302 may include a touch pad, a fingerprint sensor (for collecting fingerprint information of a user and direction information of the fingerprint), a microphone, etc., and the output device 303 may include a display (LCD, etc.), a speaker, etc.
The memory 304 may include a read-only memory and a random access memory, and provides instructions and data to the processor 301. A portion of the memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store device type information.
In a specific implementation, the processor 301, the input device 302, and the output device 303 described in this embodiment of the present invention may execute the implementation manners described in the first embodiment and the second embodiment of the population data feature extraction method provided in this embodiment of the present invention, and may also execute the implementation manners of the terminal described in this embodiment of the present invention, which is not described herein again.
In another embodiment of the present invention, a computer-readable storage medium is provided, in which a computer program is stored. The computer program includes program instructions which, when executed by a processor, implement all or part of the processes of the methods in the above embodiments. Those processes may also be completed by a computer program instructing associated hardware; the computer program may be stored in a computer-readable storage medium, and when executed by a processor, may implement the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable media may not include electrical carrier signals and telecommunications signals.
The computer-readable storage medium may be an internal storage unit of the terminal of any of the foregoing embodiments, for example, a hard disk or a memory of the terminal. The computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the terminal. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the terminal. The computer-readable storage medium is used for storing the computer program and other programs and data required by the terminal, and may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described above in general functional terms. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the terminal and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a logical division; an actual implementation may use a different division: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces or units, and may be electrical, mechanical, or another form of connection.
The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A population data feature extraction method, applied to a population monitoring system, the method comprising:
acquiring population data of a target population monitoring system;
inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data;
extracting effective keywords of text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data;
and determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data.
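The pipeline of claim 1 can be sketched as follows. This is an illustrative Python sketch only: the record layout and the toy stand-ins (`toy_model`, `toy_keywords`, `toy_fuzzify`) are assumptions for demonstration, not the patented feature extraction model, keyword extractor, or fuzzification conversion.

```python
def extract_features(record, feature_model, extract_keywords, fuzzify):
    """Sketch of the claimed pipeline for a single population record."""
    # Non-text fields pass through the preset feature extraction model.
    non_text_vec = list(feature_model(record["non_text"]))
    # Text fields are reduced to effective keywords, then fuzzified into a vector.
    keywords = extract_keywords(record["text"])
    text_vec = list(fuzzify(keywords))
    # Fusion: simple concatenation of the two vectors (see claim 6).
    return non_text_vec + text_vec

# Toy stand-ins (illustrative assumptions, not the patented components).
toy_model = lambda fields: [float(v) for v in fields]   # identity "model"
toy_keywords = lambda text: text.split()                # naive keyword picker
toy_fuzzify = lambda kws: [len(k) / 10.0 for k in kws]  # length-based "fuzzifier"

record = {"non_text": [34, 1, 2], "text": "registered urban resident"}
vec = extract_features(record, toy_model, toy_keywords, toy_fuzzify)
print(vec)  # [34.0, 1.0, 2.0, 1.0, 0.5, 0.8]
```

The dependency-injected helpers mirror the claim's separation of the non-text branch, the text branch, and the fusion step.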
2. The population data feature extraction method of claim 1, wherein the training method of the feature extraction model comprises:
acquiring population sample data, inputting the population sample data into a pre-constructed feature extraction network, and training to obtain a first feature extraction model;
extracting an initialization weight coefficient of the feature extraction network and a weight coefficient of the first feature extraction model, and determining a preferred weight coefficient based on the initialization weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model;
randomly increasing the value of the preferred weight coefficient in the first feature extraction model according to a preset proportion to obtain a plurality of second feature extraction models, and inputting the population sample data into the plurality of second feature extraction models to obtain a plurality of groups of auxiliary sample data;
and performing secondary training on the first feature extraction model based on the population sample data and the multiple groups of auxiliary sample data to obtain a trained feature extraction model.
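The perturbation step of claim 2 can be sketched as follows, under stated assumptions: the claim fixes neither the preset proportion, the number of second models, nor how the random increase is drawn, so the coin flip, the 5% proportion, and the three models below are illustrative choices only.

```python
import random

def make_second_models(first_weights, preferred, proportion=0.05, n_models=3, seed=0):
    """Derive several 'second' feature extraction models by randomly increasing
    the preferred weights of the first model by a preset proportion (sketch)."""
    rng = random.Random(seed)  # seeded for reproducibility (an assumption)
    models = []
    for _ in range(n_models):
        w = dict(first_weights)
        for name in preferred:
            if rng.random() < 0.5:           # "randomly increasing" (assumed coin flip)
                w[name] *= 1.0 + proportion  # preset proportion
        models.append(w)
    return models

first = {"w1": 0.8, "w2": -0.3, "w3": 1.2}
seconds = make_second_models(first, preferred=["w1", "w3"])
print(len(seconds))  # 3
```

Feeding the population sample data through each perturbed model would then yield the groups of auxiliary sample data used for the secondary training.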
3. The population data feature extraction method of claim 2, wherein the determining a preferred weight coefficient based on the initialization weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model comprises:
determining a rate of change of the weight coefficients based on the initialization weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model;
and taking the weight coefficient with the change rate larger than a preset threshold value as the preferred weight coefficient.
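A minimal sketch of claim 3's selection rule, assuming a relative-change formula and a threshold value that the claim leaves open:

```python
def preferred_weights(init_w, trained_w, threshold=0.5):
    """Select weights whose relative change from initialization to the first
    trained model exceeds a preset threshold. The relative-change formula is
    an assumption; the claim does not fix one."""
    preferred = {}
    for name, w0 in init_w.items():
        w1 = trained_w[name]
        rate = abs(w1 - w0) / (abs(w0) + 1e-12)  # guard against zero init
        if rate > threshold:
            preferred[name] = w1
    return preferred

init = {"w1": 1.0, "w2": 0.5, "w3": 2.0}
trained = {"w1": 1.2, "w2": 1.5, "w3": 2.1}
print(preferred_weights(init, trained))  # {'w2': 1.5}
```

Only `w2` changes by more than 50% relative to its initialization, so it alone is kept as a preferred weight coefficient.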
4. The population data feature extraction method of claim 2, wherein the performing secondary training on the first feature extraction model based on the population sample data and the plurality of groups of auxiliary sample data to obtain a trained feature extraction model comprises:
and combining the population sample data and the multiple groups of auxiliary sample data to obtain combined sample data, and performing secondary training on the first feature extraction model based on the combined sample data to obtain a trained feature extraction model.
5. The population data feature extraction method of claim 1, wherein, for text data in a certain piece of population data, the method of extracting effective keywords from the text data comprises:
carrying out approximate search in a preset keyword library, and extracting candidate keywords in the text data and target keywords corresponding to the candidate keywords in the keyword library;
and inputting each candidate keyword and the target keyword corresponding to each candidate keyword into a preset matching model, and determining effective keywords corresponding to the text data.
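The approximate-search step of claim 5 can be sketched with the standard library's `difflib` as a stand-in for the unspecified approximate-search method; the similarity cutoff and the token-level matching are assumptions, and the second step (the preset matching model) is not implemented here.

```python
import difflib

def candidate_pairs(text, keyword_library, cutoff=0.8):
    """Pair each token of the text with its closest library keyword, if any.
    difflib's ratio stands in for the patent's unspecified approximate search;
    the 0.8 cutoff is an illustrative assumption."""
    pairs = []
    for token in text.split():
        match = difflib.get_close_matches(token, keyword_library, n=1, cutoff=cutoff)
        if match:
            pairs.append((token, match[0]))  # (candidate keyword, target keyword)
    return pairs

library = ["resident", "migrant", "urban", "household"]
pairs = candidate_pairs("residnt urban worker", library)
print(pairs)  # [('residnt', 'resident'), ('urban', 'urban')]
```

The misspelled "residnt" is still paired with its target keyword "resident", while "worker", which resembles nothing in the library, produces no candidate; the resulting pairs would then go to the matching model to decide which candidates are effective keywords.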
6. The population data feature extraction method of claim 1, wherein the determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data comprises:
combining the non-text feature vector corresponding to each population data with the text feature vector corresponding to that population data, and taking the combined feature vector as the feature extraction result corresponding to that population data.
7. A population data feature extraction device, applied to a population monitoring system, the device comprising:
the data acquisition module is used for acquiring population data of the target population monitoring system;
the first feature extraction module is used for inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data;
the second feature extraction module is used for extracting effective keywords of the text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data;
and the characteristic fusion module is used for determining a characteristic extraction result corresponding to each population data based on the non-text characteristic vector corresponding to each population data and the text characteristic vector corresponding to each population data.
8. The population data feature extraction device of claim 7, further comprising a model training module, wherein the model training module is configured to: acquire population sample data, input the population sample data into a pre-constructed feature extraction network, and train to obtain a first feature extraction model;
extracting an initialization weight coefficient of the feature extraction network and a weight coefficient of the first feature extraction model, and determining a preferred weight coefficient based on the initialization weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model;
randomly increasing the value of the preferred weight coefficient in the first feature extraction model according to a preset proportion to obtain a plurality of second feature extraction models, and inputting the population sample data into the plurality of second feature extraction models to obtain a plurality of groups of auxiliary sample data;
and performing secondary training on the first feature extraction model based on the population sample data and the multiple groups of auxiliary sample data to obtain a trained feature extraction model.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011567356.5A CN112732890A (en) | 2020-12-25 | 2020-12-25 | Population data feature extraction method and device and terminal equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112732890A true CN112732890A (en) | 2021-04-30 |
Family
ID=75616593
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011567356.5A Pending CN112732890A (en) | 2020-12-25 | 2020-12-25 | Population data feature extraction method and device and terminal equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112732890A (en) |
2020-12-25: application CN202011567356.5A filed in China (CN); status: active, pending.
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111860573B (en) | Model training method, image category detection method and device and electronic equipment | |
CN109918560B (en) | Question and answer method and device based on search engine | |
CN110472675B (en) | Image classification method, image classification device, storage medium and electronic equipment | |
CN112131383B (en) | Specific target emotion polarity classification method | |
CN111274365B (en) | Intelligent inquiry method and device based on semantic understanding, storage medium and server | |
CN113298152B (en) | Model training method, device, terminal equipment and computer readable storage medium | |
CN112632248A (en) | Question answering method, device, computer equipment and storage medium | |
CN117520503A (en) | Financial customer service dialogue generation method, device, equipment and medium based on LLM model | |
CN113887214A (en) | Artificial intelligence based wish presumption method and related equipment thereof | |
CN116186223A (en) | Financial text processing method, device, equipment and storage medium | |
CN111401069A (en) | Intention recognition method and intention recognition device for conversation text and terminal | |
CN112836045A (en) | Data processing method and device based on text data set and terminal equipment | |
CN112732890A (en) | Population data feature extraction method and device and terminal equipment | |
CN115221316A (en) | Knowledge base processing method, model training method, computer device and storage medium | |
CN115455142A (en) | Text retrieval method, computer device and storage medium | |
CN115438718A (en) | Emotion recognition method and device, computer readable storage medium and terminal equipment | |
CN112416754B (en) | Model evaluation method, terminal, system and storage medium | |
CN114676237A (en) | Sentence similarity determining method and device, computer equipment and storage medium | |
CN112597208A (en) | Enterprise name retrieval method, enterprise name retrieval device and terminal equipment | |
CN117235137B (en) | Professional information query method and device based on vector database | |
CN112712792A (en) | Dialect recognition model training method, readable storage medium and terminal device | |
CN116049446B (en) | Event extraction method, device, equipment and computer readable storage medium | |
CN113421575B (en) | Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium | |
US20230298326A1 (en) | Image augmentation method, electronic device and readable storage medium | |
CN115062284A (en) | Data duplicate checking method and device based on artificial intelligence, computer equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||