CN112732890A - Population data feature extraction method and device and terminal equipment


Info

Publication number
CN112732890A
Authority
CN
China
Prior art keywords
data
feature extraction
population
text
population data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011567356.5A
Other languages
Chinese (zh)
Inventor
吴少颖
王春友
蔡博乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Hengyun Co ltd
Original Assignee
Zhongke Hengyun Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Hengyun Co ltd
Priority to CN202011567356.5A
Publication of CN112732890A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/335: Filtering based on additional data, e.g. user or group profiles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31: Indexing; Data structures therefor; Storage structures
    • G06F16/313: Selection or weighting of terms for indexing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/047: Probabilistic or stochastic networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Tourism & Hospitality (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a population data feature extraction method, a population data feature extraction device and terminal equipment, wherein the method comprises the following steps: acquiring population data of a target population monitoring system; inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data; extracting effective keywords of text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data; and determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data. The population data feature extraction method, the population data feature extraction device and the terminal equipment can more accurately and comprehensively extract the features of population data.

Description

Population data feature extraction method and device and terminal equipment
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a population data feature extraction method, a population data feature extraction device and terminal equipment.
Background
With the rapid development of the internet and big data technology, all kinds of mass data are growing at high speed and the big data era has arrived. The new ideas and application technologies of this era open up distinctive development prospects for intelligent government services and the like, and population monitoring systems have emerged accordingly.
As a large amount of population data is stored in a known population monitoring system, and the data types of the population data are complex, how to more accurately extract the data features of the population data to facilitate subsequent data processing becomes a problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a population data feature extraction method, a population data feature extraction device and terminal equipment, so as to improve the feature extraction precision of population data.
In a first aspect of the embodiments of the present invention, a population data feature extraction method is provided, where the population data feature extraction method is applied to a population monitoring system, and the method includes:
acquiring population data of a target population monitoring system;
inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data;
extracting effective keywords of text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data;
and determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data.
In a second aspect of the embodiments of the present invention, there is provided a population data feature extraction device, where the population data feature extraction device is applied to a population monitoring system, and the population data feature extraction device includes:
the data acquisition module is used for acquiring population data of the target population monitoring system;
the first feature extraction module is used for inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data;
the second feature extraction module is used for extracting effective keywords of the text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data;
and the characteristic fusion module is used for determining a characteristic extraction result corresponding to each population data based on the non-text characteristic vector corresponding to each population data and the text characteristic vector corresponding to each population data.
In a third aspect of the embodiments of the present invention, there is provided a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above-mentioned demographic data feature extraction method when executing the computer program.
In a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of the above-mentioned demographic data feature extraction method.
The population data feature extraction method, the population data feature extraction device and the terminal equipment provided by the embodiment of the invention have the beneficial effects that:
Unlike prior-art schemes that extract features directly, the invention distinguishes the text data from the non-text data in the population data, extracts a feature vector for each separately, and finally determines the feature extraction result of the population data based on the text feature vector and the non-text feature vector. In other words, compared with the prior art, the invention describes the population data more comprehensively and accurately and achieves more accurate data feature extraction.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed for the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
Fig. 1 is a schematic flow chart of a population data feature extraction method according to an embodiment of the present invention;
Fig. 2 is a block diagram of a population data feature extraction device according to an embodiment of the present invention;
Fig. 3 is a schematic block diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects to be solved by the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic flow chart of a population data feature extraction method provided in an embodiment of the present invention, the population data feature extraction method is applied to a population monitoring system, and the population data feature extraction method includes:
S101: Acquire population data of the target population monitoring system.
S102: Input the non-text data in each piece of population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each piece of population data.
S103: Extract the effective keywords of the text data in each piece of population data to obtain the effective keyword group corresponding to each piece of population data, and perform fuzzification conversion on each effective keyword group to obtain the text feature vector corresponding to each piece of population data.
In this embodiment, performing fuzzification conversion on the effective keyword group to obtain the text feature vector corresponding to each piece of population data can be detailed as follows:
For the effective keyword group corresponding to a given piece of population data, perform fuzzification conversion on each effective keyword in the group to obtain the fuzzy quantity corresponding to each effective keyword, and combine these fuzzy quantities to obtain the text feature vector corresponding to that piece of population data.
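The fuzzification step can be sketched as follows. The source does not define the fuzzification rule, so the membership table `MEMBERSHIP`, the fixed `vocabulary` ordering, and the function name `fuzzify_keywords` are all hypothetical choices made only for illustration.

```python
# Hypothetical membership degrees: the source does not specify how a keyword
# is converted into a fuzzy quantity, so these values are illustrative only.
MEMBERSHIP = {
    "no action ability": 1.0,
    "limited action ability": 0.6,
    "living alone": 0.8,
}


def fuzzify_keywords(keyword_group, vocabulary, membership=MEMBERSHIP):
    """Map an effective keyword group to a fixed-length vector of fuzzy quantities.

    Each slot of the vector corresponds to one vocabulary keyword; keywords
    absent from the group contribute 0.0 (an assumption of this sketch).
    """
    present = set(keyword_group)
    return [membership.get(k, 0.0) if k in present else 0.0 for k in vocabulary]


vocabulary = ["no action ability", "limited action ability", "living alone"]
text_vector = fuzzify_keywords(["living alone", "no action ability"], vocabulary)
print(text_vector)
```

Using a fixed vocabulary order keeps the text feature vectors of different records the same length, which makes them easier to combine with the non-text feature vectors later.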
S104: Determine the feature extraction result corresponding to each piece of population data based on the non-text feature vector and the text feature vector corresponding to that piece of population data.
In this embodiment, determining the feature extraction result based on the non-text feature vector and the text feature vector can be detailed as follows:
Combine the non-text feature vector and the text feature vector corresponding to each piece of population data, and take the combined feature vector as the feature extraction result corresponding to that piece of population data.
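A minimal sketch of the combining step, assuming that "combining" means simple concatenation of the two vectors (the source does not say which combination is used). The function name `fuse_features` is hypothetical.

```python
def fuse_features(non_text_vector, text_vector):
    """Concatenate the non-text and text feature vectors of one record.

    Concatenation is an assumption of this sketch; the source only says
    that the two vectors are combined.
    """
    return list(non_text_vector) + list(text_vector)


fused = fuse_features([0.2, 0.7], [1.0, 0.0, 0.8])
print(fused)
```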
Unlike prior-art schemes that extract features directly, the invention distinguishes the text data from the non-text data in the population data, extracts a feature vector for each separately, and finally determines the feature extraction result of the population data based on the text feature vector and the non-text feature vector. In other words, compared with the prior art, the invention describes the population data more comprehensively and accurately and achieves more accurate data feature extraction.
Optionally, as a specific implementation manner of the population data feature extraction method provided in the embodiment of the present invention, the training method of the feature extraction model is as follows:
Acquire population sample data, input the population sample data into a pre-constructed feature extraction network, and train the network to obtain a first feature extraction model.
Extract the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model, and determine the preferred weight coefficients based on them.
Randomly increase the values of the preferred weight coefficients in the first feature extraction model according to a preset proportion to obtain a plurality of second feature extraction models, and input the population sample data into the plurality of second feature extraction models to obtain multiple groups of auxiliary sample data.
Perform secondary training on the first feature extraction model based on the population sample data and the multiple groups of auxiliary sample data to obtain the trained feature extraction model.
In this embodiment, since the randomly increased preferred weight coefficients are different and the obtained second feature extraction models are also different, the values of the preferred weight coefficients in the first feature extraction model can be randomly increased according to a preset ratio to obtain a plurality of second feature extraction models.
In this embodiment, a preferred weight coefficient is a weight coefficient that has a large influence on the feature extraction accuracy.
In this embodiment, the multiple sets of auxiliary sample data are auxiliary samples generated based on the change of the preferred weight coefficient, and the introduction of the auxiliary sample data into the training of the first feature extraction model can effectively improve the diversity of the sample data, thereby improving the training precision of the first feature extraction model.
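The generation of second feature extraction models and auxiliary sample data can be sketched as below. This is only an illustration under stated assumptions: the model is reduced to a flat list of weights, `linear_model` is a stand-in for the real feature extraction network, and the perturbation rule (add a random fraction, at most `proportion`, of the weight's magnitude) is one hypothetical reading of "randomly increasing according to a preset proportion".

```python
import random


def make_second_models(weights, preferred_idx, proportion, n_models, seed=0):
    """Return n_models weight lists with the preferred weights randomly increased."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        perturbed = list(weights)
        for i in preferred_idx:
            # Add a random fraction in [0, proportion] of the weight's magnitude,
            # so the value always increases (assumed rule, not from the source).
            perturbed[i] = weights[i] + abs(weights[i]) * rng.uniform(0.0, proportion)
        models.append(perturbed)
    return models


def linear_model(weights, sample):
    """Stand-in for a feature extraction model: a plain weighted sum."""
    return sum(w * x for w, x in zip(weights, sample))


first_model = [0.5, -0.2, 1.0]
samples = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.5]]
second_models = make_second_models(first_model, preferred_idx=[0, 1],
                                   proportion=0.1, n_models=3)
# One group of auxiliary sample data per second model.
auxiliary_groups = [[linear_model(m, s) for s in samples] for m in second_models]
print(len(auxiliary_groups), len(auxiliary_groups[0]))
```

Because each second model draws its own random increments, the groups of auxiliary sample data differ from one another, which is what gives the secondary training its extra sample diversity.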
Optionally, as a specific implementation manner of the population data feature extraction method provided by the embodiment of the present invention, determining the preferred weight coefficient based on the initialized weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model includes:
Determine the change rate of each weight coefficient based on the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model.
Take each weight coefficient whose change rate is greater than a preset threshold as a preferred weight coefficient.
In this embodiment, the change rate of the weight coefficient is a ratio of a difference (here, an absolute value) between the weight coefficient of the first feature extraction model and the initialization weight coefficient of the feature extraction network to the initialization weight coefficient of the feature extraction network.
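The change-rate rule above can be sketched as follows. The function names, the `eps` guard against a zero initial weight, and the use of the absolute value in the denominator (so that the rate is never negative) are assumptions of this sketch and are not stated in the source.

```python
def change_rates(initial, trained, eps=1e-12):
    """Change rate per weight: |trained - initial| / |initial|.

    eps avoids division by zero when an initial weight is exactly 0
    (an assumption; the source does not discuss this case).
    """
    return [abs(t - i) / max(abs(i), eps) for i, t in zip(initial, trained)]


def preferred_indices(initial, trained, threshold):
    """Indices of the weights whose change rate exceeds the preset threshold."""
    rates = change_rates(initial, trained)
    return [k for k, rate in enumerate(rates) if rate > threshold]


initial_weights = [0.5, -0.2, 1.0]
trained_weights = [0.9, -0.21, 1.0]
print(preferred_indices(initial_weights, trained_weights, threshold=0.5))
```

In this toy example only the first weight changes by more than 50% of its initial value, so only index 0 is selected as a preferred weight coefficient.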
Optionally, as a specific implementation manner of the population data feature extraction method provided in the embodiment of the present invention, performing secondary training on the first feature extraction model based on population sample data and multiple sets of auxiliary sample data to obtain a trained feature extraction model, where the method includes:
Combine the population sample data and the multiple groups of auxiliary sample data to obtain combined sample data, and perform secondary training on the first feature extraction model based on the combined sample data to obtain the trained feature extraction model.
In this embodiment, the population sample data and the multiple sets of auxiliary sample data may be randomly combined to obtain combined sample data, and the first feature extraction model is trained for the second time based on the combined sample data to obtain a trained feature extraction model.
Optionally, as a specific implementation manner of the population data feature extraction method provided in the embodiment of the present invention, for text data in certain population data, a method for extracting effective keywords in the text data includes:
Perform an approximate search in a preset keyword library, and extract the candidate keywords in the text data together with the target keyword in the keyword library that corresponds to each candidate keyword.
Input each candidate keyword and its corresponding target keyword into a preset matching model, and determine the effective keywords corresponding to the text data.
In this embodiment, the valid keywords corresponding to the text data are the valid keywords corresponding to the population data, and all valid keywords corresponding to the text data form valid keyword groups corresponding to the population data.
In this embodiment, the text data in a certain population data may be a text data set describing the status of people in the target population monitoring system.
In this embodiment, a candidate keyword is a phrase in the text data that is similar to a target keyword in the keyword library. For example, if a target keyword is "no action ability" and the text data contains the phrase "lack of action ability" or "x action x ability" (where x represents a character or a word), the approximate search pairs "no action ability", as the target keyword, with that phrase. That is, the approximate search finds the target keywords in the keyword library that partially overlap with phrases in the text data. It should be noted that a candidate keyword is by definition already paired with a target keyword; a keyword in the population data that matches no target keyword in the keyword library is not a candidate keyword.
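One way to picture the approximate search is a sliding-window comparison between phrases of the text and each target keyword. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as the similarity measure; this measure, the window widths, the `min_ratio` cutoff, and the function name `approximate_search` are all assumptions, since the source does not say how the approximate search is carried out.

```python
from difflib import SequenceMatcher


def approximate_search(text, keyword_library, min_ratio=0.6):
    """Return (candidate keyword, target keyword) pairs found in the text.

    For each target keyword, the best-matching word window of the text is kept
    as the candidate keyword if its similarity ratio reaches min_ratio.
    """
    words = text.lower().split()
    pairs = []
    for target in keyword_library:
        n = len(target.split())
        best_ratio, best_phrase = 0.0, None
        # Try windows as long as the target and one word longer (assumed widths).
        for width in (n, n + 1):
            for start in range(len(words) - width + 1):
                phrase = " ".join(words[start:start + width])
                ratio = SequenceMatcher(None, phrase, target.lower()).ratio()
                if ratio > best_ratio:
                    best_ratio, best_phrase = ratio, phrase
        if best_phrase is not None and best_ratio >= min_ratio:
            pairs.append((best_phrase, target))
    return pairs


pairs = approximate_search("the resident shows lack of action ability",
                           ["no action ability"])
print(pairs)
```

Phrases that resemble no target keyword produce no pair, which mirrors the point that only phrases already paired with a target keyword count as candidate keywords.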
Optionally, as a specific implementation manner of the population data feature extraction method provided by the embodiment of the present invention, the preset matching model is a probabilistic neural network model.
Inputting each candidate keyword and a target keyword corresponding to each candidate keyword into a preset matching model, and determining an effective keyword corresponding to the text data, wherein the method comprises the following steps:
S41: Select a candidate keyword, and input the candidate keyword and its corresponding target keyword into the preset matching model to obtain the matching probability between the candidate keyword and the target keyword.
S42: If the matching probability is greater than the preset probability value, take the target keyword as an effective keyword. If the matching probability is not greater than the preset probability value, delete the candidate keyword.
S43: If all the candidate keywords have been traversed, take the effective keywords obtained in step S42 as the effective keywords corresponding to the text data.
If not all the candidate keywords have been traversed, return to step S41.
In this embodiment, the probabilistic neural network model receives a candidate keyword and its corresponding target keyword and outputs the matching probability between them. If the matching probability is not greater than the preset probability value, the match fails and the candidate keyword is deleted. If the matching probability is greater than the preset probability value, the match succeeds and the target keyword is used as the effective keyword representing the candidate keyword, which simplifies subsequent calculation.
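The traversal in steps S41 to S43 can be sketched as a simple loop. The scoring function `similarity_probability` below is only a stand-in based on string similarity; it is not a probabilistic neural network, and the function names, the default threshold, and the removal of repeated target keywords are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity_probability(candidate, target):
    """Stand-in scorer; the source uses a probabilistic neural network here."""
    return SequenceMatcher(None, candidate, target).ratio()


def select_effective_keywords(pairs, match_probability=similarity_probability,
                              threshold=0.8):
    """Traverse (candidate, target) pairs and keep the targets that match."""
    effective = []
    for candidate, target in pairs:                    # S41: pick the next pair
        probability = match_probability(candidate, target)
        if probability > threshold:                    # S42: match succeeded
            if target not in effective:                # skip repeats (assumption)
                effective.append(target)
        # S42: otherwise the candidate keyword is simply dropped
    return effective                                   # S43: all pairs traversed


example_pairs = [
    ("lack of action ability", "no action ability"),
    ("acts alone", "living alone"),
]
print(select_effective_keywords(example_pairs))
```

Any callable that takes a candidate keyword and a target keyword and returns a probability can be passed as `match_probability`, so a trained matching model could replace the stand-in without changing the loop.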
Fig. 2 is a block diagram of a demographic data feature extraction apparatus according to an embodiment of the present invention, which corresponds to the demographic data feature extraction method according to the foregoing embodiment. For convenience of explanation, only portions related to the embodiments of the present invention are shown. Referring to fig. 2, the population data feature extraction device 20 is applied to a population monitoring system, and the population data feature extraction device 20 includes: the system comprises a data acquisition module 21, a first feature extraction module 22, a second feature extraction module 23 and a feature fusion module 24.
The data obtaining module 21 is configured to obtain population data of the target population monitoring system.
The first feature extraction module 22 is configured to input the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data.
The second feature extraction module 23 is configured to extract effective keywords of the text data in each population data to obtain effective keyword groups corresponding to each population data, and perform fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data.
And the feature fusion module 24 is configured to determine a feature extraction result corresponding to each demographic data based on the non-text feature vector corresponding to each demographic data and the text feature vector corresponding to each demographic data.
Optionally, referring to fig. 2, as a specific implementation manner of the population data feature extraction apparatus provided in the embodiment of the present invention, the population data feature extraction apparatus 20 may further include a model training module 25, where the model training module 25 is configured to:
Acquire population sample data, input the population sample data into a pre-constructed feature extraction network, and train the network to obtain a first feature extraction model.
Extract the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model, and determine the preferred weight coefficients based on them.
Randomly increase the values of the preferred weight coefficients in the first feature extraction model according to a preset proportion to obtain a plurality of second feature extraction models, and input the population sample data into the plurality of second feature extraction models to obtain multiple groups of auxiliary sample data.
Perform secondary training on the first feature extraction model based on the population sample data and the multiple groups of auxiliary sample data to obtain the trained feature extraction model.
Optionally, as a specific implementation manner of the population data feature extraction apparatus provided in the embodiment of the present invention, determining the preferred weight coefficient based on the initialized weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model includes:
determining a rate of change of the weight coefficients based on the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model.
And taking the weight coefficient with the change rate larger than a preset threshold value as the preferred weight coefficient.
Optionally, as a specific implementation manner of the population data feature extraction device provided in the embodiment of the present invention, performing secondary training on the first feature extraction model based on population sample data and multiple sets of auxiliary sample data to obtain a trained feature extraction model, including:
Combine the population sample data and the multiple groups of auxiliary sample data to obtain combined sample data, and perform secondary training on the first feature extraction model based on the combined sample data to obtain the trained feature extraction model.
Optionally, as a specific implementation manner of the population data feature extraction device provided in the embodiment of the present invention, for text data in certain population data, a method for extracting effective keywords in the text data includes:
Perform an approximate search in a preset keyword library, and extract the candidate keywords in the text data together with the target keyword in the keyword library that corresponds to each candidate keyword.
Input each candidate keyword and its corresponding target keyword into a preset matching model, and determine the effective keywords corresponding to the text data.
Optionally, as a specific implementation manner of the population data feature extraction device provided in the embodiment of the present invention, determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data includes:
combining the non-text feature vectors corresponding to the population data and the text feature vectors corresponding to the population data, and taking the feature vectors corresponding to the combined population data as feature extraction results corresponding to the population data.
Referring to Fig. 3, Fig. 3 is a schematic block diagram of a terminal device according to an embodiment of the present invention. The terminal 300 in this embodiment, as shown in Fig. 3, may include one or more processors 301, one or more input devices 302, one or more output devices 303, and one or more memories 304. The processor 301, the input device 302, the output device 303, and the memory 304 communicate with each other via a communication bus 305. The memory 304 is used to store a computer program comprising program instructions, and the processor 301 is used to execute the program instructions stored in the memory 304. Specifically, the processor 301 is configured to call the program instructions to perform the functions of the modules/units in the above device embodiments, such as the functions of the modules 21 to 25 shown in Fig. 2.
It should be understood that, in the embodiment of the present invention, the Processor 301 may be a Central Processing Unit (CPU), and the Processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The input device 302 may include a touch pad, a fingerprint sensor (for collecting fingerprint information of a user and direction information of the fingerprint), a microphone, etc., and the output device 303 may include a display (LCD, etc.), a speaker, etc.
The memory 304 may include a read-only memory and a random access memory, and provides instructions and data to the processor 301. A portion of the memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store device type information.
In a specific implementation, the processor 301, the input device 302, and the output device 303 described in this embodiment of the present invention may execute the implementation manners described in the first embodiment and the second embodiment of the population data feature extraction method provided in this embodiment of the present invention, and may also execute the implementation manners of the terminal described in this embodiment of the present invention, which is not described herein again.
In another embodiment of the present invention, a computer-readable storage medium is provided that stores a computer program including program instructions. When executed by a processor, the program instructions implement all or part of the processes in the methods of the above embodiments. These processes may also be completed by a computer program instructing the associated hardware; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals.
The computer readable storage medium may be an internal storage unit of the terminal of any of the foregoing embodiments, for example, a hard disk or a memory of the terminal. The computer readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk provided on the terminal, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the terminal. The computer-readable storage medium is used for storing a computer program and other programs and data required by the terminal. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To illustrate clearly the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the terminal and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division into units is merely a logical division, and an actual implementation may divide them differently: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces or units, and may be electrical, mechanical, or of another form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A population data feature extraction method, applied to a population monitoring system, comprising the following steps:
acquiring population data of a target population monitoring system;
inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data;
extracting effective keywords of text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data;
and determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data.
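By way of illustration only, the steps of claim 1 could be sketched as follows. This is a minimal sketch and not the claimed implementation: the record layout (`"non_text"` and `"text"` keys) and the three callables `non_text_model`, `extract_keywords`, and `fuzzify` are hypothetical stand-ins for the preset feature extraction model, the effective keyword extraction, and the fuzzification conversion. The two vectors are combined here by simple concatenation, which is one possible way of determining the result.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def extract_record_features(
    record: Dict[str, object],
    non_text_model: Callable[[Sequence[float]], Sequence[float]],
    extract_keywords: Callable[[str], List[str]],
    fuzzify: Callable[[List[str]], Sequence[float]],
) -> Vector:
    """Return one feature vector for a single population data record.

    All names are illustrative assumptions, not taken from the patent.
    """
    # Non-text data -> non-text feature vector via the preset model.
    non_text_vec = list(non_text_model(record["non_text"]))
    # Text data -> effective keyword group -> text feature vector.
    keywords = extract_keywords(record["text"])
    text_vec = list(fuzzify(keywords))
    # Assumed combination rule: concatenate the two vectors.
    return non_text_vec + text_vec


def extract_all(records, non_text_model, extract_keywords, fuzzify):
    """Apply the per-record extraction to every acquired record."""
    return [
        extract_record_features(r, non_text_model, extract_keywords, fuzzify)
        for r in records
    ]
```

Passing the three stages in as callables keeps the sketch independent of any particular model or keyword library.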
2. The population data feature extraction method of claim 1, wherein the training method of the feature extraction model is:
acquiring population sample data, inputting the population sample data into a pre-constructed feature extraction network, and training to obtain a first feature extraction model;
extracting an initialization weight coefficient of the feature extraction network and a weight coefficient of the first feature extraction model, and determining a preferred weight coefficient based on the initialization weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model;
randomly increasing the value of the preferred weight coefficient in the first feature extraction model according to a preset proportion to obtain a plurality of second feature extraction models, and inputting the population sample data into the plurality of second feature extraction models to obtain a plurality of groups of auxiliary sample data;
and performing secondary training on the first feature extraction model based on the population sample data and the multiple groups of auxiliary sample data to obtain a trained feature extraction model.
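By way of illustration only, the generation of the second feature extraction models and of the auxiliary sample data could be sketched as follows. The claim does not fix how the random increase is drawn, so this sketch assumes that each preferred weight is raised by a random fraction, between zero and the preset proportion, of its absolute value. The flat weight list, the function names, and the `model` callable are all hypothetical.

```python
import random
from typing import Callable, List, Sequence


def perturb_preferred_weights(
    weights: Sequence[float],
    preferred_idx: Sequence[int],
    ratio: float,
    n_models: int,
    seed: int = 0,
) -> List[List[float]]:
    """Build n_models weight sets, each a perturbed copy of `weights`.

    Assumption: a preferred weight w becomes w + |w| * u, with u drawn
    uniformly from [0, ratio], so its value never decreases.
    """
    rng = random.Random(seed)
    variants = []
    for _ in range(n_models):
        w = list(weights)  # copy, so the first model stays unchanged
        for i in preferred_idx:
            w[i] = w[i] + abs(w[i]) * rng.uniform(0.0, ratio)
        variants.append(w)
    return variants


def auxiliary_outputs(
    variants: Sequence[Sequence[float]],
    samples: Sequence[Sequence[float]],
    model: Callable[[Sequence[float], Sequence[float]], float],
) -> List[List[float]]:
    """Run every sample through every perturbed model (one group per model)."""
    return [[model(w, s) for s in samples] for w in variants]
```

Each inner list returned by `auxiliary_outputs` corresponds to one group of auxiliary sample data, which could then be merged with the original samples for the secondary training.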
3. The population data feature extraction method of claim 2, wherein the determining a preferred weight coefficient based on the initialization weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model comprises:
determining a rate of change of weight coefficients based on the initialized weight coefficients of the feature extraction network and the weight coefficients of the first feature extraction model;
and taking the weight coefficient with the change rate larger than a preset threshold value as the preferred weight coefficient.
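By way of illustration only, the selection of preferred weight coefficients could be sketched as follows. The claim does not define the rate of change, so this sketch assumes the relative change |w_trained - w_init| / |w_init|, with a small `eps` guarding against division by zero. The function name and the flat weight lists are hypothetical.

```python
from typing import List, Sequence


def select_preferred(
    init_weights: Sequence[float],
    trained_weights: Sequence[float],
    threshold: float,
    eps: float = 1e-12,
) -> List[int]:
    """Return the indices of weights whose rate of change exceeds threshold.

    Assumed definition of the rate of change: relative change with
    respect to the initialization weight.
    """
    preferred = []
    for i, (w0, w1) in enumerate(zip(init_weights, trained_weights)):
        rate = abs(w1 - w0) / max(abs(w0), eps)
        if rate > threshold:
            preferred.append(i)
    return preferred
```

For example, with initialization weights `[1.0, 2.0, 0.5]`, trained weights `[1.1, 4.0, 0.5]`, and a threshold of 0.5, only the second weight (index 1) doubles during training and is therefore selected.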
4. The population data feature extraction method of claim 2, wherein the performing secondary training on the first feature extraction model based on the population sample data and the multiple groups of auxiliary sample data to obtain a trained feature extraction model comprises:
and combining the population sample data and the multiple groups of auxiliary sample data to obtain combined sample data, and performing secondary training on the first feature extraction model based on the combined sample data to obtain a trained feature extraction model.
5. The population data feature extraction method of claim 1, wherein, for the text data in a given population data, the method for extracting effective keywords from the text data comprises:
carrying out approximate search in a preset keyword library, and extracting candidate keywords in the text data and target keywords corresponding to the candidate keywords in the keyword library;
and inputting each candidate keyword and the target keyword corresponding to each candidate keyword into a preset matching model, and determining effective keywords corresponding to the text data.
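By way of illustration only, the two stages of claim 5 could be sketched with Python's standard `difflib` module. This is a sketch under stated assumptions: the approximate search is realized with `difflib.get_close_matches`, and the preset matching model is replaced by a plain similarity threshold on `difflib.SequenceMatcher.ratio()`; a trained matching model would take its place in practice. The function names, the `cutoff` and `accept` values, and the sample keyword library are hypothetical.

```python
import difflib
from typing import List, Sequence, Tuple


def find_candidates(
    tokens: Sequence[str], keyword_library: Sequence[str], cutoff: float = 0.6
) -> List[Tuple[str, str]]:
    """Approximate search: pair each token with its closest library keyword."""
    pairs = []
    for tok in tokens:
        matches = difflib.get_close_matches(tok, keyword_library, n=1, cutoff=cutoff)
        if matches:
            pairs.append((tok, matches[0]))  # (candidate, target keyword)
    return pairs


def match_score(candidate: str, target: str) -> float:
    """Stand-in for the preset matching model: string similarity in [0, 1]."""
    return difflib.SequenceMatcher(None, candidate, target).ratio()


def effective_keywords(
    tokens: Sequence[str], keyword_library: Sequence[str], accept: float = 0.8
) -> List[str]:
    """Keep the target keywords whose candidate pair passes the matcher."""
    return [
        target
        for cand, target in find_candidates(tokens, keyword_library)
        if match_score(cand, target) >= accept
    ]


library = ["migrant", "worker", "resident"]
print(effective_keywords(["migrnt", "worker", "xyz"], library))
```

The misspelled token `migrnt` is mapped to the library keyword `migrant`, while `xyz` has no sufficiently close match and is discarded.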
6. The population data feature extraction method of claim 1, wherein the determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data comprises:
combining the non-text feature vector corresponding to each population data with the text feature vector corresponding to that population data, and taking the combined feature vector of each population data as the feature extraction result corresponding to that population data.
7. A population data feature extraction device, applied to a population monitoring system, comprising:
the data acquisition module is used for acquiring population data of the target population monitoring system;
the first feature extraction module is used for inputting the non-text data in each population data into a preset feature extraction model to obtain a non-text feature vector corresponding to each population data;
the second feature extraction module is used for extracting effective keywords of the text data in each population data to obtain effective keyword groups corresponding to each population data, and performing fuzzification conversion on the effective keyword groups to obtain text feature vectors corresponding to each population data;
and the feature fusion module is used for determining a feature extraction result corresponding to each population data based on the non-text feature vector corresponding to each population data and the text feature vector corresponding to each population data.
8. The population data feature extraction device of claim 7, further comprising a model training module, wherein the model training module is configured for:
acquiring population sample data, inputting the population sample data into a pre-constructed feature extraction network, and training to obtain a first feature extraction model;
extracting an initialization weight coefficient of the feature extraction network and a weight coefficient of the first feature extraction model, and determining a preferred weight coefficient based on the initialization weight coefficient of the feature extraction network and the weight coefficient of the first feature extraction model;
randomly increasing the value of the preferred weight coefficient in the first feature extraction model according to a preset proportion to obtain a plurality of second feature extraction models, and inputting the population sample data into the plurality of second feature extraction models to obtain a plurality of groups of auxiliary sample data;
and performing secondary training on the first feature extraction model based on the population sample data and the multiple groups of auxiliary sample data to obtain a trained feature extraction model.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202011567356.5A 2020-12-25 2020-12-25 Population data feature extraction method and device and terminal equipment Pending CN112732890A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011567356.5A CN112732890A (en) 2020-12-25 2020-12-25 Population data feature extraction method and device and terminal equipment


Publications (1)

Publication Number Publication Date
CN112732890A true CN112732890A (en) 2021-04-30

Family

ID=75616593

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011567356.5A Pending CN112732890A (en) 2020-12-25 2020-12-25 Population data feature extraction method and device and terminal equipment

Country Status (1)

Country Link
CN (1) CN112732890A (en)

Similar Documents

Publication Publication Date Title
CN111860573B (en) Model training method, image category detection method and device and electronic equipment
CN109918560B (en) Question and answer method and device based on search engine
CN110472675B (en) Image classification method, image classification device, storage medium and electronic equipment
CN112131383B (en) Specific target emotion polarity classification method
CN111274365B (en) Intelligent inquiry method and device based on semantic understanding, storage medium and server
CN113298152B (en) Model training method, device, terminal equipment and computer readable storage medium
CN112632248A (en) Question answering method, device, computer equipment and storage medium
CN117520503A (en) Financial customer service dialogue generation method, device, equipment and medium based on LLM model
CN113887214A (en) Artificial intelligence based wish presumption method and related equipment thereof
CN116186223A (en) Financial text processing method, device, equipment and storage medium
CN111401069A (en) Intention recognition method and intention recognition device for conversation text and terminal
CN112836045A (en) Data processing method and device based on text data set and terminal equipment
CN112732890A (en) Population data feature extraction method and device and terminal equipment
CN115221316A (en) Knowledge base processing method, model training method, computer device and storage medium
CN115455142A (en) Text retrieval method, computer device and storage medium
CN115438718A (en) Emotion recognition method and device, computer readable storage medium and terminal equipment
CN112416754B (en) Model evaluation method, terminal, system and storage medium
CN114676237A (en) Sentence similarity determining method and device, computer equipment and storage medium
CN112597208A (en) Enterprise name retrieval method, enterprise name retrieval device and terminal equipment
CN117235137B (en) Professional information query method and device based on vector database
CN112712792A (en) Dialect recognition model training method, readable storage medium and terminal device
CN116049446B (en) Event extraction method, device, equipment and computer readable storage medium
CN113421575B (en) Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium
US20230298326A1 (en) Image augmentation method, electronic device and readable storage medium
CN115062284A (en) Data duplicate checking method and device based on artificial intelligence, computer equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination