CN116644184B - Human resource information management system based on data clustering - Google Patents

Publication number
CN116644184B
CN116644184B CN202310933469.XA CN202310933469A CN116644184B CN 116644184 B CN116644184 B CN 116644184B CN 202310933469 A CN202310933469 A CN 202310933469A CN 116644184 B CN116644184 B CN 116644184B
Authority
CN
China
Prior art keywords
dimension
resource information
human resource
keyword
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310933469.XA
Other languages
Chinese (zh)
Other versions
CN116644184A (en)
Inventor
竹甜钿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Houxue Network Technology Co ltd
Original Assignee
Zhejiang Houxue Network Technology Co ltd
Application filed by Zhejiang Houxue Network Technology Co ltd
Priority to CN202310933469.XA
Publication of CN116644184A
Application granted
Publication of CN116644184B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques
    • G06F18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/254: Fusion techniques of classification results, e.g. of results related to same input data
    • G06F18/256: Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G06F40/216: Parsing using statistical methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/10: Office automation; Time management
    • G06Q10/105: Human resources
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Human Resources & Organizations (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data processing, and in particular to a human resource information management system based on data clustering. The system comprises: a data acquisition module for acquiring the keyword area of each dimension of the human resource information and setting the corresponding dimension weights; a comprehensive advantage analysis module for acquiring the non-repeated texts in each keyword area and computing the comprehensive advantage parameter of each dimension in each piece of human resource information; a dimension advantage analysis module for computing the same-dimension advantage parameter of each dimension from the hierarchical weights of the same dimension across the human resource information; and a resource information management module for clustering the human resource information by combining, for each dimension of each piece of human resource information, the comprehensive advantage parameter, the same-dimension advantage parameter, and the dimension weight. By analyzing both the text information of the different dimensions of the human resource information and the relative advantage within each dimension, the invention clusters the human resource information more accurately and improves the efficiency of managing it.

Description

Human resource information management system based on data clustering
Technical Field
The invention relates to the technical field of data processing, in particular to a human resource information management system based on data clustering.
Background
Human resource management is the collective term for a series of management activities, such as planning, recruiting, training, evaluating, and motivating personnel, carried out by an enterprise or organization. Recruitment decisions are made mainly by screening the resumes submitted by job seekers, and training mainly draws on related training content obtained from the internet and similar sources. Because the ability levels of job seekers vary widely, assigning posts and hiring job seekers is inefficient, which makes the management of an enterprise's human resource information very important.
In the prior art, word vectors of the different types of keywords in human resource information are obtained with a latent semantic model; the word vectors of repeated keywords and of non-repeated keywords are each weighted and summed according to their summation factors to obtain a feature vector of the human resource information, and the human resource information is managed based on that feature vector. However, human resource information contains several different types of information, and the same information may be repeated across types. Because of this repeated information, the feature vector cannot represent the human resource information accurately, so the clustering result is inaccurate and the efficiency of managing the human resource information is reduced.
Disclosure of Invention
In order to solve the technical problem that repeated information across the different types of content in human resource information makes the clustering result inaccurate, the invention aims to provide a human resource information management system based on data clustering. The adopted technical scheme is as follows:
the invention provides a human resource information management system based on data clustering, which comprises:
the data acquisition module, which is used for dividing each piece of human resource information into at least two dimensions according to preset keywords of different types, one dimension corresponding to one type of preset keyword; setting the dimension weight of each dimension of each piece of human resource information; and acquiring the keyword area of each dimension according to the position of the preset keyword in the human resource information;
the comprehensive advantage analysis module, which is used for acquiring the non-repeated texts in each keyword area; and acquiring the comprehensive advantage parameter of each dimension in each piece of human resource information according to the distribution disorder of the non-repeated texts in the keyword areas of any dimension and any other dimension and the difference in the numbers of non-repeated texts;
the dimension advantage analysis module, which is used for acquiring the hierarchical weight of each dimension of each piece of human resource information; and obtaining the same-dimension advantage parameter of each dimension of each piece of human resource information according to the hierarchical weight of that dimension and the hierarchical weights of the corresponding dimension of the other pieces of human resource information;
and the resource information management module, which is used for clustering and managing the human resource information by combining the comprehensive advantage parameter, the same-dimension advantage parameter, and the dimension weight of each dimension of each piece of human resource information.
Further, the method for acquiring the keyword region comprises the following steps:
traversing all texts in each piece of human resource information; if no text in the human resource information matches a given type of preset keyword, the keyword area of the dimension corresponding to that type of preset keyword contains no text, and the label value of that dimension is set to a first label value;
if a text in the human resource information matches a given type of preset keyword, that text is taken as the keyword text of the corresponding dimension, and the label value of that dimension is set to a second label value; in each piece of human resource information, for every two adjacent keyword texts, all texts from the first text of the preceding keyword text up to the first text of the following keyword text are taken as the keyword area of the dimension corresponding to the preceding keyword text; and all texts from the first text of the last keyword text to the end of the human resource information are taken as the keyword area of the dimension corresponding to the last keyword text.
Further, the method for acquiring the comprehensive advantage parameter comprises the following steps:
for any two dimensions of each piece of human resource information, substituting the probability of each non-repeated text in the keyword area of one dimension, and then in the keyword area of the other dimension, into the information entropy formula to obtain the text confusion degree of each of the two dimensions; substituting the probability of each non-repeated text across the two keyword areas taken together into the information entropy formula to obtain the joint text confusion degree of the two dimensions; and taking the ratio of the sum of the two text confusion degrees to the joint text confusion degree as the joint influence parameter of the two dimensions;
counting the number of non-repeated texts in each keyword area;
for any two dimensions, taking the sum of the numbers of non-repeated texts in their keyword areas as the numerator and the sum of the absolute value of the difference of those numbers and a preset constant as the denominator, and taking the resulting ratio as the quantity adjustment value of the two dimensions; using the quantity adjustment value as a weight to adjust the joint influence parameter, thereby obtaining the quantity influence parameter of the two dimensions;
when the label value of a dimension of the human resource information is the first label value, the comprehensive advantage parameter of that dimension is 0; for a dimension whose label value is the second label value, applying a negatively correlated, normalized mapping to the mean of the quantity influence parameters between that dimension and every other dimension to obtain the comprehensive advantage parameter of that dimension.
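The steps above can be written compactly as follows. The notation is assumed for illustration and does not come from the source: $H_a$ and $H_b$ are the text confusion degrees of dimensions $a$ and $b$, $H_{ab}$ is their joint text confusion degree, $n_a$ and $n_b$ are the numbers of non-repeated texts, $c$ is the preset constant, and $M$ is the number of dimensions. The source only requires a negatively correlated, normalized mapping in the last step; the exponential used here is one possible choice, not necessarily the one the inventors used.

```latex
\begin{aligned}
J_{ab} &= \frac{H_a + H_b}{H_{ab}}
  && \text{(joint influence parameter)} \\
Q_{ab} &= \frac{n_a + n_b}{\lvert n_a - n_b \rvert + c}
  && \text{(quantity adjustment value)} \\
E_{ab} &= Q_{ab}\, J_{ab}
  && \text{(quantity influence parameter)} \\
Z_a &=
  \begin{cases}
    0, & \text{label of } a \text{ is the first label value} \\
    \exp\!\Bigl(-\dfrac{1}{M-1} \sum_{b \neq a} E_{ab}\Bigr), & \text{label of } a \text{ is the second label value}
  \end{cases}
  && \text{(comprehensive advantage parameter)}
\end{aligned}
```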
Further, the method for acquiring the same-dimension advantage parameter comprises the following steps:
taking the difference between the hierarchical weight of any dimension of each piece of human resource information and the mean of the hierarchical weights of the same dimension in the other pieces of human resource information as the same-dimension advantage parameter of that dimension of that piece of human resource information.
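A minimal sketch of this step, assuming the hierarchical weights are already available as a list of rows (one row per piece of human resource information, one column per dimension). The function name and data layout are illustrative and not taken from the source.

```python
from statistics import mean


def same_dimension_advantage(level_weights):
    """Return, for every record i and dimension m, the hierarchical weight of
    (i, m) minus the mean hierarchical weight of dimension m over all OTHER
    records. level_weights[i][m] is the hierarchical weight (assumed layout)."""
    n = len(level_weights)
    if n < 2:
        raise ValueError("at least two records are needed to compare a dimension")
    dims = len(level_weights[0])
    result = []
    for i, row in enumerate(level_weights):
        advantages = []
        for m in range(dims):
            others = [level_weights[k][m] for k in range(n) if k != i]
            advantages.append(row[m] - mean(others))
        result.append(advantages)
    return result


# Three records, two dimensions (made-up weights).
example_weights = [[0.5, 0.2], [0.3, 0.2], [0.1, 0.8]]
example_advantage = same_dimension_advantage(example_weights)
```

A positive value means the record is stronger than the others in that dimension; a negative value means it is weaker.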
Further, the method for clustering and managing the human resource information by combining the comprehensive advantage parameter, the same-dimension advantage parameter, and the dimension weight of each dimension of each piece of human resource information comprises the following steps:
normalizing the comprehensive advantage parameter and the same-dimension advantage parameter separately to obtain a normalized comprehensive advantage parameter and a normalized same-dimension advantage parameter;
taking a preset first adjustment coefficient as the weight of the normalized comprehensive advantage parameter and a preset second adjustment coefficient as the weight of the normalized same-dimension advantage parameter, and computing their weighted sum for each dimension of each piece of human resource information to obtain the weighted influence parameter of that dimension; multiplying the weighted influence parameter of each dimension by the dimension weight of that dimension to obtain the final dimension value of that dimension;
summing the final dimension values over all dimensions of each piece of human resource information to obtain the clustering parameter of that piece of human resource information;
clustering the clustering parameters with a clustering algorithm to obtain a preset number of clusters; and displaying the human resource information corresponding to the clustering parameters in their respective clusters.
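The combination and clustering steps can be sketched as below. Several details are assumptions rather than statements from the source: min-max normalization over all records and dimensions, adjustment coefficients of 0.5 each, and a one-dimensional K-means with a deterministic start as the clustering algorithm. All function names are hypothetical.

```python
def min_max(values):
    """Min-max normalize a flat list to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def clustering_parameters(comprehensive, same_dim, dim_weights, a1=0.5, a2=0.5):
    """comprehensive[i][m] and same_dim[i][m] are the two advantage parameters
    of dimension m in record i; a1 and a2 are the preset adjustment
    coefficients (values assumed). Returns one clustering parameter per record."""
    dims = len(dim_weights)
    norm_c = min_max([v for row in comprehensive for v in row])
    norm_s = min_max([v for row in same_dim for v in row])
    params = []
    for i in range(len(comprehensive)):
        total = 0.0
        for m in range(dims):
            idx = i * dims + m
            weighted = a1 * norm_c[idx] + a2 * norm_s[idx]  # weighted influence parameter
            total += weighted * dim_weights[m]              # final dimension value
        params.append(total)
    return params


def kmeans_1d(values, k, iterations=100):
    """Plain 1-D K-means; centers start at evenly spaced order statistics."""
    ordered = sorted(values)
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


# Four records, two dimensions, equal dimension weights (made-up values).
comp = [[0.9, 0.8], [0.85, 0.9], [0.1, 0.2], [0.15, 0.1]]
same = [[0.3, 0.2], [0.25, 0.3], [-0.3, -0.2], [-0.25, -0.3]]
params = clustering_parameters(comp, same, dim_weights=[0.5, 0.5])
labels = kmeans_1d(params, k=2)
```

Because each record is reduced to a single clustering parameter, the clustering itself is one-dimensional, which is how the scheme sidesteps high-dimensional distance computations.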
Further, the method for acquiring the non-repeated text comprises the following steps:
removing the repeated texts in each keyword area with a text de-duplication algorithm to obtain the non-repeated texts of that keyword area.
Further, the method for acquiring the hierarchical weight comprises the following steps:
when the label value of a dimension of the human resource information is the first label value, the hierarchical weight of that dimension is 0;
otherwise, analyzing the texts in the keyword areas of the same dimension across all human resource information with the analytic hierarchy process (AHP), based on the preset screening content of each dimension, to obtain the hierarchical weight of each dimension of each piece of human resource information.
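The source does not say how the pairwise judgments are derived from the screening content, so the sketch below only illustrates the final step of an analytic hierarchy process for one dimension: turning a reciprocal pairwise comparison matrix over the records into priority weights. It uses the row geometric mean method, a common approximation of the principal eigenvector, which is an assumption here. Function names and the matrix values are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Priority weights from a reciprocal pairwise comparison matrix using the
    row geometric mean method. pairwise[i][j] states how strongly item i is
    preferred over item j, with pairwise[j][i] == 1 / pairwise[i][j]."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def hierarchical_weights(labels, pairwise_present):
    """Hierarchical weight of one dimension for every record. labels[i] is 0
    when the dimension is absent from record i (weight 0) and 1 otherwise;
    pairwise_present compares only the records whose label is 1, in order."""
    priorities = iter(ahp_weights(pairwise_present))
    return [next(priorities) if label == 1 else 0.0 for label in labels]


# Records 0 and 2 contain the dimension; record 0 is judged 3x better (made up).
example_labels = [1, 0, 1]
example_matrix = [[1.0, 3.0], [1.0 / 3.0, 1.0]]
example_level_weights = hierarchical_weights(example_labels, example_matrix)
```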
The invention has the following beneficial effects:
In the embodiments of the invention, human resource information is clustered on the basis of the text information of each dimension, so obtaining the keyword area of each dimension is the prerequisite for cluster analysis. Information repeated across dimensions, that is, across the keyword areas of different types, provides no effective information to the clustering process and degrades the clustering effect. The distribution disorder of the non-repeated texts in the keyword areas of different dimensions of the same piece of human resource information reflects how strongly the dimensions influence one another; because differences in the number of non-repeated texts between dimensions may also affect the clustering effect, the influence between dimensions is adjusted by that difference in number, so that the comprehensive advantage parameter reflects the advantage information of the combined dimensions more accurately. The hierarchical weight represents how well the content of a dimension in the human resource information fits the enterprise's related content; comparing the hierarchical weights of the same dimension across all human resource information allows the same-dimension advantage parameter to represent the advantage of a single dimension more accurately. The comprehensive advantage parameter, obtained from the mutual influence of repeated content between different dimensions of the same resume, measures the comprehensive capability of each dimension in the resume; the same-dimension advantage parameter represents the importance of the same dimension across different pieces of human resource information; and the dimension weight measures how important the information of each dimension in the human resource information is to the enterprise's related content. Combining these three factors represents the characteristics of the human resource information accurately, and clustering on these characteristics avoids the curse of dimensionality, increases the accuracy of the clustering, and thereby improves the efficiency of managing the human resource information.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a system block diagram of a human resource information management system based on data clustering according to an embodiment of the present invention.
Detailed Description
In order to further describe the technical means and effects adopted by the invention to achieve its intended purpose, the specific implementation, structure, features, and effects of the human resource information management system based on data clustering according to the invention are described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The invention addresses the following specific scenario: managing human resource information often requires classifying it, but when the existing K-means clustering algorithm is used to cluster human resource information, the many different types of information it contains easily give rise to the "curse of dimensionality", making the clustering result inaccurate. The invention quantifies resume information by the amount of information it carries and by the personal ability advantages it shows, and on that basis combines the different types of information in each resume into a clustering parameter for that resume, so as to cluster and manage the human resource information.
The following specifically describes a specific scheme of the human resource information management system based on data clustering provided by the invention with reference to the accompanying drawings.
Referring to fig. 1, a system block diagram of a human resource information management system based on data clustering according to an embodiment of the present invention is shown, where the system includes: the system comprises a data acquisition module 101, a comprehensive advantage analysis module 102, a dimension advantage analysis module 103 and a resource information management module 104.
The data acquisition module 101 is configured to divide each piece of human resource information into at least two dimensions according to preset keywords in different categories, where one dimension corresponds to one type of preset keyword; setting dimension weights of all dimensions of each piece of human resource information; and acquiring a keyword area of each dimension according to the position of the preset keyword in the human resource information.
Talent recruitment has an extremely important effect on bringing new talent into enterprises and companies. Recruitment decisions are made mainly by screening the resumes submitted by job seekers. Because job seekers are numerous and their ability levels vary widely, assigning posts and hiring are inefficient, so classifying resumes is very important to an enterprise's human resource information management.
The resume information collected by an enterprise is unstructured and disordered, and clustering it is easily affected by redundant, irrelevant information, which increases the computation needed to analyze each resume and obtain its clustering parameter. Multidimensional information therefore needs to be extracted from each resume so that the resume information has structured, uniform features for analysis. The embodiment of the invention performs cluster analysis on resumes in a fixed template format.
Specifically, the embodiment of the invention analyzes the n resumes received by an enterprise in one day. Using existing techniques, the heading information in each resume in the resume database can be obtained with a text detection algorithm; by tallying the content of the heading information, the enterprise can define specific preset keywords according to its own needs, such as educational experience, work experience, and project experience. In the embodiment of the invention, the enterprise defines M preset keywords, each corresponding to one dimension of the resume, so each resume has M dimensions.
Because an enterprise attaches different importance to each preset keyword when recruiting, that is, values them differently, a dimension weight needs to be set for each dimension of each resume. The specific dimension weights can be determined from research by the enterprise's human resources department, and the implementer can set them according to the actual situation. In the embodiment of the invention, the preset keywords corresponding to all dimensions are treated as equally important for recruitment, so the dimension weights of all dimensions in a resume are set equal. Denoting the dimension weight of the m-th dimension by $w_m$, this gives $w_1 = \dots = w_m = \dots = w_M$, where $w_1$ is the dimension weight of the 1st dimension in the resume, $w_m$ is the dimension weight of the m-th dimension in the resume, and $M$ is the number of dimensions in each resume.
Clustering the resume is based on text information corresponding to each dimension, and the keyword area of the dimension in the resume is the premise of performing cluster analysis on the resume.
Preferably, the keyword area is obtained as follows: traversing all texts in each piece of human resource information; if no text in the human resource information matches a given type of preset keyword, the keyword area of the dimension corresponding to that type of preset keyword contains no text, and the label value of that dimension is set to a first label value; if a text in the human resource information matches a given type of preset keyword, that text is taken as the keyword text of the corresponding dimension, and the label value of that dimension is set to a second label value; in each piece of human resource information, for every two adjacent keyword texts, all texts from the first text of the preceding keyword text up to the first text of the following keyword text are taken as the keyword area of the dimension corresponding to the preceding keyword text; and all texts from the first text of the last keyword text to the end of the human resource information are taken as the keyword area of the dimension corresponding to the last keyword text.
As one example, the number of categories of preset keywords equals the number of dimensions in the resume. If no text identical to a given type of preset keyword can be found in the resume, the label value of the corresponding dimension is set to the first label value, and the keyword area of that dimension contains no text; if an identical text is found, the label value of the corresponding dimension is set to the second label value, and a keyword area exists for that dimension in the resume. In the embodiment of the invention, the first and second label values are set to 0 and 1 respectively; the implementer may choose other label values. For convenience of description, the text in a resume is represented by digits. Suppose the text of a resume is 12135121356789, the keyword text of the 1st dimension is 1, that of the 2nd dimension is 4, and that of the 3rd dimension is 5. Then the keyword area of the 1st dimension, corresponding to keyword text 1, is 121351213, and its label value is the second label value 1; the keyword area of the 2nd dimension, corresponding to keyword text 4, contains no text, and its label value is the first label value 0; the keyword area of the 3rd dimension, corresponding to keyword text 5, is 56789, and its label value is the second label value 1. Because each preset keyword is chosen as a brief description of a particular part of the resume, keyword texts cannot be adjacent to one another or be the last text of the resume, so every keyword text in the resume has a corresponding keyword area.
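The keyword-area rule can be sketched on a fixed-template resume as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each heading is assumed to occur once in the resume, matching is done by exact substring search, and the function name, dimension names, and sample text are hypothetical.

```python
def keyword_regions(text, keywords, first_label=0, second_label=1):
    """Split `text` into one keyword area per dimension.

    keywords maps a dimension name to its preset keyword (heading text).
    Returns (labels, regions): a dimension whose keyword is missing gets
    first_label and an empty area; otherwise it gets second_label and the
    text from its keyword up to the next keyword found (or the end)."""
    positions = {dim: text.find(kw) for dim, kw in keywords.items()}
    found = sorted((pos, dim) for dim, pos in positions.items() if pos >= 0)
    labels = {dim: first_label for dim in keywords}
    regions = {dim: "" for dim in keywords}
    for idx, (pos, dim) in enumerate(found):
        end = found[idx + 1][0] if idx + 1 < len(found) else len(text)
        labels[dim] = second_label
        regions[dim] = text[pos:end]
    return labels, regions


resume = "Education: BSc in Statistics. Work Experience: data analyst, 3 years."
preset = {
    "education": "Education",
    "work": "Work Experience",
    "project": "Project Experience",  # not present in this resume
}
labels, regions = keyword_regions(resume, preset)
```

Here the "project" dimension receives the first label value and an empty keyword area, while the other two dimensions each receive the text from their heading up to the next heading.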
So far, the keyword area of each dimension of each resume is obtained.
The comprehensive advantage analysis module 102 is configured to obtain a text that is not repeated in the keyword region; and acquiring comprehensive advantage parameters of each dimension in each piece of human resource information according to the distribution confusion condition of the non-repeated text in the keyword area of any dimension and any other dimension and the difference of the number of the non-repeated text.
The more information a resume carries, the better it helps the enterprise understand the job seeker; and the more advantages a resume shows in its various dimensions relative to other resumes, the more clearly it distinguishes the job seeker from others. Such a resume is a good resume for human resource management.
The information carried in a resume may contain repetitions, and information repeated across the dimensions of a resume affects the clustering effect, so the analysis must be based on the non-repeated texts in each keyword area. The non-repeated texts are obtained as follows: the repeated texts in each keyword area are removed with a text de-duplication algorithm to obtain the non-repeated texts of that keyword area. In the embodiment of the invention, the minimum hash (MinHash) algorithm is chosen to de-duplicate the text of each keyword area and obtain its non-repeated texts. The MinHash algorithm is well known to those skilled in the art and is not described further here.
When an enterprise clusters resumes, the more comprehensive and detailed the information carried in a resume, the better. However, because job seekers' personal histories differ, a large amount of information may be repeated across the contents of different dimensions of a resume, so that some dimensions cannot provide additional effective information in the clustering process and the dimensions easily interfere with one another.
Repeated texts in a keyword area can prevent a dimension from providing additional effective information in the resume clustering process. The distribution disorder of the non-repeated texts in the keyword areas of different dimensions of the same resume reflects how strongly the dimensions influence one another. Because differences in the number of non-repeated texts between dimensions may also affect the clustering effect, the influence between dimensions is adjusted by that difference in number, so that the comprehensive advantage parameter reflects the advantage information of each dimension more accurately.
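As a rough illustration of MinHash-based de-duplication, the sketch below treats each sentence of a keyword area as one "text", builds a MinHash signature over character shingles, and keeps a sentence only if its estimated Jaccard similarity to every sentence already kept is below a threshold. The shingle length, number of hash functions, threshold, and function names are all assumptions; the source names only the algorithm.

```python
import hashlib


def shingles(text, k=3):
    """Character k-grams of the lower-cased, whitespace-normalized text."""
    text = " ".join(text.lower().split())
    if len(text) <= k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def minhash_signature(text, num_hashes=64, k=3):
    """One minimum per salted hash function over the shingle set."""
    grams = shingles(text, k)
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a different salt acts as a different hash
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(g.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for g in grams
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def deduplicate(sentences, threshold=0.8, num_hashes=64):
    """Keep the first occurrence of each (near-)duplicate sentence."""
    kept, signatures = [], []
    for sentence in sentences:
        sig = minhash_signature(sentence, num_hashes)
        if all(estimated_jaccard(sig, other) < threshold for other in signatures):
            kept.append(sentence)
            signatures.append(sig)
    return kept


area = [
    "Built a sales dashboard in Python",
    "Built a sales  dashboard in python",  # same text up to case and spacing
    "Led a team of five engineers",
]
unique_texts = deduplicate(area)
```

Pairwise comparison of signatures is quadratic in the number of sentences; for large keyword areas a locality-sensitive hashing index would normally replace the inner loop.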
Preferably, the method for acquiring the comprehensive advantage parameters comprises the following steps: for any two dimensions of each piece of human resource information, substituting the probability of each non-repeated text in the keyword area of one dimension and in the keyword area of the other dimension into an information entropy formula, and sequentially obtaining the text confusion degree of the corresponding two dimensions; substituting the probability of each non-repeated text in the two keyword areas corresponding to any two dimensions into an information entropy formula to obtain the text joint confusion corresponding to the two dimensions; taking the ratio of the sum of the text confusion corresponding to any two dimensions to the text joint confusion as a joint influence parameter corresponding to the two dimensions; counting the number of the non-repeated texts in each keyword area; taking the sum of the numbers of the non-repeated texts in the keyword areas of any two dimensions as the numerator and the sum of the absolute value of the difference of the numbers and a preset constant as the denominator, and taking the resulting ratio as a quantity adjustment value of the corresponding two dimensions; using the quantity adjustment value of any two dimensions as the weight of the joint influence parameter to adjust it, so that the quantity influence parameter corresponding to the two dimensions is obtained; when the label value of the dimension of the human resource information is the first label value, the comprehensive advantage parameter of the corresponding dimension is 0; and carrying out negative correlation and normalized mapping on the average value of the quantity influence parameters of any dimension whose label value is the second label value and any other dimension to obtain the comprehensive advantage parameter of the corresponding dimension.
As an example, to facilitate understanding, the texts in a resume are replaced with numbers. Assume the texts in the keyword area of the m-th dimension of the resume are 1, 2, 3, 4, 5 and the texts in the keyword area of the i-th dimension are 2, 3, 4, 5, 6. The numbers of non-repeated texts in the keyword areas of the m-th and i-th dimensions are then 5 and 5 respectively, and the non-repeated texts in the two keyword areas taken together are 1, 2, 3, 4, 5, 6, six in total. The more repeated texts the keyword areas of the two dimensions share, the smaller the text joint confusion of the two dimensions and the larger the joint influence parameter $L_{m,i}$, indicating that the i-th dimension has a greater influence on the m-th dimension. When the m-th dimension exists and the i-th dimension does not, the information entropy of the non-repeated texts of the two keyword areas equals that of the keyword area of the m-th dimension, and $L_{m,i}$ equals 1; when the text contents of the two keyword areas are identical, $L_{m,i}$ equals 2. When the numbers of non-repeated texts in the two keyword areas differ greatly, the influence on the whole resume is small even if the joint influence parameter is large. For example, when the number $K_m$ of non-repeated texts in the keyword area of the m-th dimension differs greatly from the number $K_i$ in the keyword area of the i-th dimension, it cannot be concluded that the i-th dimension strongly influences the m-th dimension, even if the non-repeated texts of the i-th dimension are a subset of those of the m-th dimension. Therefore, on the basis of the joint influence parameter $L_{m,i}$, the difference in the numbers of non-repeated texts in the two keyword areas is used as a constraint to obtain the quantity influence parameter $S_{m,i}$ corresponding to the two dimensions. The smaller the difference between the numbers of non-repeated texts in the keyword areas of the i-th and m-th dimensions, the larger $S_{m,i}$, the more serious the influence of the i-th dimension on the m-th dimension, and the weaker the comprehensive degree of the m-th dimension.
It should be noted that, when the tag value of the m-th dimension of the resume is the first tag value 0, no text identical to the m-th preset keyword exists in the resume, that is, the m-th dimension has no keyword area and therefore no comprehensive advantage, and its comprehensive advantage parameter is set to 0. For an m-th dimension whose tag value is the second tag value 1, the quantity influence parameters between the m-th dimension and each other dimension of the resume are acquired, and the average of all these quantity influence parameters is subjected to negative correlation and normalized mapping. The more seriously the other dimensions influence the content of the m-th dimension, the worse its comprehensive degree; after the negative correlation mapping, a larger comprehensive advantage parameter $Z_m$ indicates stronger comprehensive capability of the m-th dimension.
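The three cases of the worked example (partial overlap, a missing dimension, identical content) can be checked numerically with the short Python sketch below. It assumes that the probability of a text is its relative frequency in the (pooled) keyword area, which the source does not state explicitly; the function names and the fallback for a zero joint entropy are likewise hypothetical.

```python
import math
from collections import Counter

def entropy(texts):
    """Shannon entropy (base 2) of the relative frequencies of the texts."""
    if not texts:
        return 0.0
    total = len(texts)
    counts = Counter(texts)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def joint_influence(region_m, region_i):
    """Ratio of the sum of the two text confusions to the text joint confusion."""
    h_joint = entropy(list(region_m) + list(region_i))
    if h_joint == 0.0:
        # Degenerate case not covered by the source; this sketch falls back to 1.0.
        return 1.0
    return (entropy(region_m) + entropy(region_i)) / h_joint

overlap = joint_influence([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])    # partial overlap
missing = joint_influence([1, 2, 3, 4, 5], [])                 # i-th dimension absent
identical = joint_influence([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])  # identical content
```

With this reading, the absent-dimension case gives 1, the identical case gives 2, and the partially overlapping regions of the example fall between the two.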
And combining the distribution confusion condition of the non-repeated texts in the key word areas of the m dimension and other dimensions of the resume and the number difference of the non-repeated texts to obtain the comprehensive advantage parameter of the m dimension of the resume. The calculation formula of the comprehensive advantage parameter is as follows:
$$Z_m = \exp\left(-\frac{1}{M-1}\sum_{i=1,\ i\neq m}^{M} S_{m,i}\right)$$

$$S_{m,i} = \frac{K_m + K_i}{\left|K_m - K_i\right| + a}\times L_{m,i}$$

$$L_{m,i} = \frac{\sum_{k=1}^{K_m} p_{m,k}\log_2 p_{m,k} + \sum_{k=1}^{K_i} p_{i,k}\log_2 p_{i,k}}{\sum_{k=1}^{K_{m,i}} p_{(m,i),k}\log_2 p_{(m,i),k}}$$

where $Z_m$ is the comprehensive advantage parameter of the m-th dimension of each resume; $M$ is the number of dimensions in each resume, namely the number of categories of preset keywords; $S_{m,i}$ is the quantity influence parameter of the m-th dimension and the i-th dimension of each resume, where $i \neq m$ and $i \in [1, M]$; $K_m$ is the number of non-repeated texts in the keyword area of the m-th dimension; $K_i$ is the number of non-repeated texts in the keyword area of the i-th dimension; $L_{m,i}$ is the joint influence parameter of the i-th dimension on the m-th dimension; $p_{m,k}$ is the probability of occurrence of the k-th non-repeated text in the keyword area of the m-th dimension; $p_{i,k}$ is the probability of occurrence of the k-th non-repeated text in the keyword area of the i-th dimension; $K_{m,i}$ is the number of non-repeated texts in the two keyword areas corresponding to the m-th and i-th dimensions; $p_{(m,i),k}$ is the probability of occurrence of the k-th non-repeated text in the two keyword areas corresponding to the m-th and i-th dimensions; $a$ is a preset constant taking the empirical value 1, whose role is to prevent the formula from becoming meaningless; $\log_2$ is the logarithm to base 2; $|\cdot|$ is the absolute value function; and $\exp$ is the exponential function with the natural constant e as its base.
It should be noted that the more repeated content the keyword areas of the m-th and i-th dimensions of the resume share, the smaller the text joint confusion of the two dimensions is relative to the sum of their text confusions, and the larger the joint influence parameter $L_{m,i}$ of the two dimensions. A difference in the numbers of non-repeated texts in the two keyword areas changes how strongly the text contents of the two areas affect each other, and hence the clustering of the whole resume, so the quantity adjustment value $\frac{K_m + K_i}{|K_m - K_i| + a}$ is used to adjust $L_{m,i}$ and improve its accuracy. The more repeated texts there are between the keyword areas of the other dimensions and the m-th dimension, the larger the mean of the quantity influence parameters $S_{m,i}$, the more seriously the content of the m-th dimension is affected by the other dimensions, and the worse its comprehensive degree. After the negative correlation mapping, a larger comprehensive advantage parameter $Z_m$ indicates stronger comprehensive capability of the m-th dimension: the other dimensions affect it less through repeated text, and it provides more information during clustering. In the joint influence parameter, both the numerator and the denominator are information entropy formulas, so their negative signs cancel.
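The way the quantity adjustment value and the negative correlation mapping combine can be sketched in Python as follows. This is an illustration under assumptions: the mapping is taken to be exp(-mean), as the reference to the natural constant e suggests, the joint influence values are supplied as plain numbers, and the function names are hypothetical.

```python
import math

def quantity_influence(k_m, k_i, joint_influence, a=1.0):
    """Weight the joint influence parameter by the quantity adjustment value.

    k_m, k_i: numbers of non-repeated texts in the two keyword areas.
    a: preset constant (empirical value 1) that keeps the denominator non-zero.
    """
    return (k_m + k_i) / (abs(k_m - k_i) + a) * joint_influence

def comprehensive_advantage(label, influences):
    """Negative-correlation, normalized mapping of the mean quantity influence.

    label: 0 if the dimension has no keyword area (first tag value), else 1.
    influences: quantity influence parameters against every other dimension.
    Returning 0 for an empty list is an assumption of this sketch.
    """
    if label == 0 or not influences:
        return 0.0
    return math.exp(-sum(influences) / len(influences))

# Equal-sized regions amplify the joint influence; very unequal ones damp it.
balanced = quantity_influence(5, 5, 2.0)     # 10 / 1 * 2
unbalanced = quantity_influence(50, 5, 2.0)  # 55 / 46 * 2
z_absent = comprehensive_advantage(0, [balanced])
z_light = comprehensive_advantage(1, [1.0, 3.0])    # exp(-2)
z_heavy = comprehensive_advantage(1, [balanced, unbalanced])
```

A dimension that overlaps heavily with the others receives a value close to 0, while a dimension with little overlap receives a larger value, which matches the direction described in the text.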
According to the method for calculating the comprehensive advantage parameter of the m-th dimension of the resume, the comprehensive advantage parameters of each dimension of each resume are obtained.
So far, each dimension of each resume has a corresponding comprehensive advantage parameter.
A dimension dominance analysis module 103, configured to obtain a hierarchical weight of each dimension of each piece of human resource information; and acquiring the same-dimensional advantage parameters of each dimension of each piece of human resource information according to the hierarchical weight of any dimension of any piece of human resource information and the hierarchical weight of the corresponding dimension of other human resource information.
When an enterprise screens resumes, whether the capability in a certain dimension of a resume is outstanding matters greatly, particularly when the enterprise screens for specialist talent. The hierarchical weight of each dimension of each resume is therefore obtained and used as a measure of personal capability within the same dimension across resumes.
Preferably, the method for acquiring the hierarchical weight comprises the following steps: when the label value of the dimension of the human resource information is the first label value, the hierarchical weight of the corresponding dimension is 0; and analyzing texts in the keyword areas of any same dimension of all the human resource information by using a hierarchical analysis method based on preset screening content of each dimension to obtain hierarchical weights of each dimension of each piece of human resource information.
As an example, take the case where the m-th keyword is "working experience". The keyword area of each dimension in a resume corresponds to a content requirement of the enterprise's post recruitment, and the preset screening content of each dimension is that content requirement. Suppose the content of the keyword area of the "working experience" dimension in the 1st resume is "3 years of working experience in the service industry", and the content of the keyword area of the "working experience" dimension in the 2nd resume states years of working experience in the computer industry. Assume the preset screening content of the "working experience" dimension in the enterprise's post recruitment requirement is "computer". Applying the analytic hierarchy process to the preset screening content of the "working experience" dimension yields the weights of the 1st and 2nd resumes relative to the enterprise's post recruitment, namely the hierarchical weights. The larger the hierarchical weight, the better the fit, so in the "working experience" dimension the hierarchical weight of the 1st resume is lower than that of the 2nd resume. The analytic hierarchy process is a well-known technique to those skilled in the art and is not described herein.
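How an analytic hierarchy process turns pairwise judgments into hierarchical weights can be illustrated with the following Python sketch. The pairwise comparison matrix is a hypothetical judgment (the 2nd resume is rated three times as suitable as the 1st for a "computer" post), and the row geometric mean is used as a common approximation of the principal eigenvector; the source does not specify either choice.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[r][c] states how strongly item r is preferred over item c;
    a consistent matrix satisfies pairwise[c][r] == 1 / pairwise[r][c].
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical judgments for the "working experience" dimension:
# row 0 = 1st resume (service industry), row 1 = 2nd resume (computer industry).
pairwise = [
    [1.0, 1.0 / 3.0],
    [3.0, 1.0],
]
weights = ahp_weights(pairwise)
```

With these assumed judgments the 1st resume receives a lower hierarchical weight than the 2nd, in line with the example.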
The same-dimension advantage parameter of a dimension of one resume is obtained by comparing the hierarchical weight of that dimension with the hierarchical weights of the same dimension in the other resumes. Because the hierarchical weight reflects how well the content of the dimension matches the recruitment content of the enterprise, this comparison presents the advantage of the dimension in the resume more accurately.
Preferably, the method for acquiring the same-dimension dominant parameter comprises the following steps: and taking the difference value between the hierarchical weight of any dimension of each piece of human resource information and the average value of the hierarchical weights of the same dimension of other human resource information as the same-dimension dominant parameter of the corresponding dimension of each piece of human resource information.
The calculation formula of the same-dimensional dominance parameter of the m dimension of the nth resume is as follows:
$$T_{n,m} = C_{n,m} - \frac{1}{N-1}\sum_{j=1,\ j\neq n}^{N} C_{j,m}$$

where $T_{n,m}$ is the same-dimension advantage parameter of the m-th dimension of the n-th resume; $C_{n,m}$ is the hierarchical weight of the m-th dimension of the n-th resume; $C_{j,m}$ is the hierarchical weight of the m-th dimension of the j-th resume, where $j \neq n$ and $j \in [1, N]$; and $N$ is the total number of resumes acquired by the company in one day.
It should be noted that, if the capability in a certain dimension of a resume is outstanding, the analytic hierarchy process yields a larger hierarchical weight for that dimension. To compare the advantage of the same dimension across multiple resumes, the average of the hierarchical weights of the same dimension in the other resumes is calculated. When the information of a certain dimension of the n-th resume is more prominent than the information of the corresponding dimension of the other resumes, the same-dimension advantage parameter $T_{n,m}$ is positive and larger; when the information of a certain dimension of the n-th resume is less prominent than that of the corresponding dimension of the other resumes, $T_{n,m}$ is negative and smaller.
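The same-dimension advantage parameter is simple enough to verify by hand; the Python sketch below computes it for one dimension across a hypothetical batch of three resumes. The function name and the sample weights are illustrative assumptions.

```python
def same_dimension_advantage(hierarchy_weights):
    """For each resume, its hierarchical weight minus the mean weight of all other resumes.

    hierarchy_weights: hierarchical weights of one dimension, one entry per resume.
    """
    n = len(hierarchy_weights)
    if n < 2:
        raise ValueError("at least two resumes are needed for a comparison")
    total = sum(hierarchy_weights)
    return [w - (total - w) / (n - 1) for w in hierarchy_weights]

# Hypothetical hierarchical weights of the same dimension in three resumes.
advantages = same_dimension_advantage([0.2, 0.5, 0.8])
```

The weakest resume receives a negative value, the strongest a positive one, and a resume whose weight equals the mean of the others receives a value of zero.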
And obtaining the same-dimensional dominant parameters of each dimension of each resume according to the calculation method of the same-dimensional dominant parameters of the m dimension of the nth resume.
So far, each dimension of each resume has corresponding homodimensional dominant parameters.
The resource information management module 104 is configured to perform cluster management on the human resource information by combining the comprehensive advantage parameter of any dimension of each piece of human resource information, the same-dimension advantage parameter and the dimension weight.
The comprehensive advantage parameter, obtained from the mutual influence of repeated content between different dimensions of the same resume, measures the comprehensive capability of each dimension in the resume. The same-dimension advantage parameter represents how prominent the same dimension is across different resumes, and the dimension weight measures how important the information of each dimension is for the enterprise's recruitment. Any dimension of a resume is analyzed by combining these three factors, and the resulting final dimension value represents the importance of the dimension information with respect to the enterprise's requirements.
Preferably, the method for acquiring the final dimension value of each dimension of each resume is as follows: respectively carrying out normalization processing on the comprehensive advantage parameter and the same-dimension advantage parameter to sequentially obtain a normalized comprehensive advantage parameter and a normalized single-dimension advantage parameter; taking a preset first adjustment coefficient as the weight of the normalized comprehensive advantage parameter, taking a preset second adjustment coefficient as the weight of the normalized single-dimensional advantage parameter, and carrying out weighted summation on the normalized comprehensive advantage parameter and the normalized single-dimensional advantage parameter of each dimension of each piece of human resource information to obtain the weighted influence parameter of each dimension of each piece of human resource information; and multiplying the weighted influence parameter of each dimension of each piece of human resource information by the dimension weight of the corresponding dimension to obtain the final dimension value of each dimension of each piece of human resource information.
It should be noted that the preset first adjustment coefficient $\alpha$ and the preset second adjustment coefficient $\beta$ sum to 1, and each takes a value between 0 and 1. Their specific sizes are adjusted according to the actual situation of the enterprise as follows: if the enterprise values comprehensive capability, $\alpha$ is set larger and $\beta$ smaller; if the enterprise values outstanding capability in a single field, $\alpha$ is set smaller and $\beta$ larger; if the enterprise values both comprehensive capability and outstanding single-field capability, $\alpha$ and $\beta$ are both set to 0.5. In the embodiment of the invention, $\alpha$ and $\beta$ take the empirical values 0.5 and 0.5 respectively. In the embodiment of the invention, a normalization function is used to normalize the comprehensive advantage parameter and the same-dimension advantage parameter separately; other embodiments may select other normalization methods, such as function transformation or maximum-minimum normalization, which is not limited herein.
The dimension weight is adjusted by combining the comprehensive advantage parameter and the same-dimension advantage parameter of each dimension of each resume to obtain the final dimension value of each dimension of each resume. The calculation formula of the final dimension value is as follows:
$$F_{n,m} = W_{n,m}\times Q_{n,m},\qquad Q_{n,m} = \alpha\,\mathrm{Norm}\left(Z_{n,m}\right) + \beta\,\mathrm{Norm}\left(T_{n,m}\right)$$

where $F_{n,m}$ is the final dimension value of the m-th dimension of the n-th resume; $Z_{n,m}$ is the comprehensive advantage parameter of the m-th dimension of the n-th resume; $T_{n,m}$ is the same-dimension advantage parameter of the m-th dimension of the n-th resume; $W_{n,m}$ is the dimension weight of the m-th dimension of the n-th resume; $\alpha$ is the preset first adjustment coefficient; $\beta$ is the preset second adjustment coefficient; $Q_{n,m}$ is the weighted influence parameter of the m-th dimension of the n-th resume; and $\mathrm{Norm}$ is a normalization function.
It should be noted that the preset first adjustment coefficient and the preset second adjustment coefficient are adjusted according to how much the enterprise values comprehensive capability and single-field advantage. The normalized comprehensive advantage parameter $\mathrm{Norm}(Z_{n,m})$ and the normalized single-dimensional advantage parameter $\mathrm{Norm}(T_{n,m})$ are weighted and summed so that the weighted influence parameter of the corresponding dimension of the resume reflects the recruitment requirements of the enterprise. The dimension weight of the dimension is then adjusted by the weighted influence parameter, so that the final dimension value of the dimension better matches the recruitment requirements of the enterprise.
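The weighted combination can be sketched in Python as below. The sketch assumes maximum-minimum normalization applied across resumes for a single dimension, which is only one of the normalization options the description allows; the function names and sample values are hypothetical.

```python
def min_max(values):
    """Maximum-minimum normalization to [0, 1]; constant inputs map to 0 (an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def final_dimension_values(z, t, w, alpha=0.5, beta=0.5):
    """Final dimension value of one dimension for every resume.

    z: comprehensive advantage parameters, one per resume.
    t: same-dimension advantage parameters, one per resume.
    w: dimension weights, one per resume.
    alpha, beta: preset adjustment coefficients that sum to 1.
    """
    z_norm, t_norm = min_max(z), min_max(t)
    return [wi * (alpha * zi + beta * ti) for wi, zi, ti in zip(w, z_norm, t_norm)]

# Hypothetical values for one dimension across three resumes.
values = final_dimension_values(
    z=[0.1, 0.3, 0.5],
    t=[-0.45, 0.0, 0.45],
    w=[0.4, 0.4, 0.4],
)
```

Raising alpha relative to beta shifts the result toward resumes whose dimensions overlap little with one another, while raising beta favours resumes that stand out within a single dimension.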
The final dimension value presents the importance degree of dimension information suitable for enterprise requirements, the comprehensive advantage and the single advantage of each dimension in the resume are considered based on the clustering parameters acquired by the final dimension value, and the accuracy of the clustering parameters for representing the resume features is improved. The acquisition method of the clustering parameters comprises the following steps: and accumulating final dimension values of each dimension of each piece of human resource information to obtain clustering parameters of the corresponding human resource information. The calculation formula of the clustering parameters of each resume is as follows:
$$J_n = \sum_{m=1}^{M} F_{n,m}$$

where $J_n$ is the clustering parameter of the n-th resume; $F_{n,m}$ is the final dimension value of the m-th dimension of the n-th resume; and $M$ is the number of dimensions of each resume.
It should be noted that, when every dimension of a resume has a corresponding keyword, the larger the final dimension value of each dimension, the more distinct the clustering feature of the resume; each dimension of the resume is then advantageous, and the resume carries a larger amount of information.
The clustering parameters are clustered with a clustering algorithm to obtain a preset number of clusters, and the human resource information corresponding to the clustering parameters in different clusters is displayed. In the embodiment of the invention, the clustering parameters are clustered with the K-means clustering algorithm, with K taking the empirical value 3; an implementer may set K according to actual conditions. The clustering parameters are thus divided into 3 categories, that is, the resumes corresponding to the clustering parameters in each cluster form one category, and the resumes of each category are transmitted to the display module of the human resource information management system for display.
The K-means clustering algorithm is a well-known technique for those skilled in the art, and will not be described herein.
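The accumulation of final dimension values and the subsequent clustering can be sketched in plain Python as follows. The one-dimensional K-means below uses evenly spaced initial centers, which is an assumption of this sketch made for reproducibility; the function names and sample values are hypothetical, and a library implementation of K-means could be used instead.

```python
def clustering_parameters(final_values):
    """Clustering parameter of each resume: the sum of its final dimension values."""
    return [sum(row) for row in final_values]

def kmeans_1d(values, k=3, max_iter=100):
    """Minimal one-dimensional K-means with evenly spaced initial centers (an assumption)."""
    lo, hi = min(values), max(values)
    if k > 1:
        centers = [lo + (hi - lo) * c / (k - 1) for c in range(k)]
    else:
        centers = [sum(values) / len(values)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Recompute each center as the mean of its members; keep empty clusters in place.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical final dimension values: six resumes, two dimensions each.
final_values = [
    [0.05, 0.05], [0.06, 0.06],
    [0.25, 0.25], [0.30, 0.25],
    [0.45, 0.45], [0.50, 0.45],
]
params = clustering_parameters(final_values)
labels, centers = kmeans_1d(params, k=3)
```

Resumes that share a label would be passed to the display module as one category.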
The present invention has been completed.
In summary, in the embodiment of the present invention, the data acquisition module is configured to acquire a keyword area of each dimension of the human resource information, and set a corresponding dimension weight; the comprehensive advantage analysis module is used for acquiring non-repeated texts in the keyword area and acquiring comprehensive advantage parameters of each dimension in each piece of human resource information; the dimension advantage analysis module is used for acquiring the same-dimension advantage parameters of each dimension according to the hierarchy weights of the same dimension of the human resource information; and clustering management is carried out on the human resource information by combining comprehensive advantage parameters of any dimension in each piece of human resource information, the same-dimension advantage parameters and the dimension weight. According to the invention, text information of different dimensionalities of the human resource information and the advantages of the same dimensionality are analyzed, so that the human resource information is clustered more accurately, and the management efficiency of the human resource information is improved.
It should be noted that: the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.
The foregoing description of the preferred embodiments of the present invention is not intended to be limiting, but rather, any modifications, equivalents, improvements, etc. that fall within the principles of the present invention are intended to be included within the scope of the present invention.

Claims (4)

1. A human resource information management system based on data clustering, the system comprising:
the data acquisition module is used for dividing each piece of human resource information into at least two dimensions according to different types of preset keywords, and one dimension corresponds to one type of preset keyword; setting dimension weights of all dimensions of each piece of human resource information; acquiring a keyword area of each dimension according to the position of a preset keyword in the human resource information;
the comprehensive advantage analysis module is used for acquiring a text which is not repeated in the keyword area; acquiring comprehensive advantage parameters of each dimension in each piece of human resource information according to the distribution confusion condition of non-repeated texts in a keyword area of any dimension and other arbitrary dimensions and the difference of the number of the non-repeated texts;
The dimension advantage analysis module is used for acquiring the hierarchical weight of each dimension of each piece of human resource information; according to the hierarchical weight of any one dimension of any one piece of human resource information and the hierarchical weight of the corresponding dimension of other human resource information, the same-dimension advantage parameter of each dimension of each piece of human resource information is obtained;
the resource information management module is used for carrying out cluster management on the human resource information by combining the comprehensive advantage parameter of any dimension of each human resource information, the same-dimension advantage parameter and the dimension weight;
the keyword region acquisition method comprises the following steps:
traversing all texts in each piece of human resource information; if the same text does not exist in the human resource information in each type of preset keywords, the text does not exist in the keyword area of the corresponding dimension of the preset keywords of the corresponding type, and the tag value of the corresponding dimension is set to be a first tag value;
if the same text exists in the human resource information in each type of preset keywords, the text is used as the keyword text of the corresponding dimension of the preset keywords of the corresponding type in the human resource information, and the label value of the corresponding dimension of the preset keywords of the corresponding type is set as a second label value; in each piece of human resource information, all texts in the area from the first keyword text in the front keyword text in the two adjacent keyword texts to the first keyword text in the rear keyword text are used as keyword areas of corresponding dimensions of the front keyword text; all texts in the region from the first keyword text to the end of the human resource information in the last keyword text are used as the keyword region of the corresponding dimension of the last keyword text;
The method for acquiring the comprehensive advantage parameters comprises the following steps:
substituting the probability of each non-repeated text in the keyword area of one dimension and the keyword area of the other dimension into an information entropy formula for any two dimensions of each piece of human resource information, and sequentially obtaining the text confusion degree of the corresponding two dimensions; substituting the probability of each non-repeated text in the two keyword areas corresponding to any two dimensions into an information entropy formula to obtain the text joint confusion corresponding to the two dimensions; taking the ratio of the sum of the text confusion corresponding to any two dimensions to the text joint confusion as a joint influence parameter corresponding to the two dimensions;
counting the number of the non-repeated texts in each keyword area;
taking the sum of the numbers of the non-repeated texts in the keyword areas of any two dimensions as the numerator and the sum of the absolute value of the difference of the numbers and a preset constant as the denominator, and taking the resulting ratio as a quantity adjustment value of the corresponding two dimensions; the quantity adjustment values of any two dimensions are used as weights of the joint influence parameters to adjust them, so that quantity influence parameters corresponding to the two dimensions are obtained;
When the label value of the dimension of the human resource information is a first label value, the comprehensive advantage parameter of the corresponding dimension is 0; carrying out negative correlation and normalized mapping on the average value of the quantity influence parameters of any dimension with the label value being the second label value and any other dimension to obtain the comprehensive advantage parameters of the corresponding dimension;
the method for acquiring the same-dimensional dominant parameters comprises the following steps:
and taking the difference value between the hierarchical weight of any one dimension of each piece of human resource information and the average value of the hierarchical weights of the same dimension of other human resource information as the same-dimension advantage parameter of the corresponding dimension of each piece of human resource information.
2. The human resource information management system based on data clustering according to claim 1, wherein the method for clustering management of human resource information by combining the comprehensive dominance parameter, the homodimensional dominance parameter and the dimensional weight of any dimension of each piece of human resource information is as follows:
respectively carrying out normalization processing on the comprehensive advantage parameter and the same-dimensional advantage parameter to sequentially obtain a normalized comprehensive advantage parameter and a normalized single-dimensional advantage parameter;
Taking a preset first adjustment coefficient as the weight of the normalized comprehensive advantage parameter, taking a preset second adjustment coefficient as the weight of the normalized single-dimensional advantage parameter, and carrying out weighted summation on the normalized comprehensive advantage parameter and the normalized single-dimensional advantage parameter of each dimension of each piece of human resource information to obtain a weighted influence parameter of each dimension of each piece of human resource information; obtaining a final dimension value of each dimension of each piece of human resource information by multiplying the weighted influence parameter of each dimension of each piece of human resource information by the dimension weight of the corresponding dimension;
accumulating the final dimension values of each dimension of each piece of human resource information to obtain clustering parameters of the corresponding human resource information;
clustering the clustering parameters by using a clustering algorithm to obtain a preset number of clustering clusters; and displaying the human resource information corresponding to the clustering parameters in different clusters.
3. The human resource information management system based on data clustering as claimed in claim 1, wherein the non-repeated text obtaining method comprises:
and removing repeated texts in each keyword area by using a text de-duplication algorithm to obtain non-repeated texts in the corresponding keyword area.
4. The human resource information management system based on data clustering according to claim 1, wherein the method for obtaining the hierarchical weight comprises:
when the label value of a dimension of the human resource information is the first label value, setting the hierarchical weight of the corresponding dimension to 0;
and analyzing the texts in the keyword areas of the same dimension across all the human resource information using a hierarchical analysis method, based on the preset screening content of each dimension, to obtain the hierarchical weight of each dimension of each piece of human resource information.
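One way to read this claim is sketched below, assuming that the "hierarchical analysis method" refers to the analytic hierarchy process (AHP) and that the comparison of keyword-area texts against the preset screening content has already produced a reciprocal pairwise comparison matrix. Neither assumption is confirmed by the claim text. The weights are approximated with the row geometric-mean method, and the names `ahp_weights` and `hierarchical_weights` are hypothetical.

```python
import math
from typing import List


def ahp_weights(pairwise: List[List[float]]) -> List[float]:
    """Approximate AHP priority weights from a reciprocal pairwise
    comparison matrix using the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def hierarchical_weights(has_first_label: List[bool],
                         pairwise: List[List[float]]) -> List[float]:
    """Weights of one dimension across all records.

    Records whose label value is the first label value get weight 0.
    `pairwise` compares only the remaining records, in record order
    (how it is derived from the texts is outside this sketch).
    """
    remaining = sum(1 for flag in has_first_label if not flag)
    if remaining != len(pairwise):
        raise ValueError("pairwise matrix size must match the unflagged records")
    weights = iter(ahp_weights(pairwise)) if pairwise else iter(())
    return [0.0 if flag else next(weights) for flag in has_first_label]
```

The geometric-mean method is used here only because it needs no linear algebra library; the principal-eigenvector method is the other common way to derive AHP weights.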
CN202310933469.XA 2023-07-27 2023-07-27 Human resource information management system based on data clustering Active CN116644184B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310933469.XA CN116644184B (en) 2023-07-27 2023-07-27 Human resource information management system based on data clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310933469.XA CN116644184B (en) 2023-07-27 2023-07-27 Human resource information management system based on data clustering

Publications (2)

Publication Number Publication Date
CN116644184A CN116644184A (en) 2023-08-25
CN116644184B CN116644184B (en) 2023-10-20

Family

ID=87623379

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310933469.XA Active CN116644184B (en) 2023-07-27 2023-07-27 Human resource information management system based on data clustering

Country Status (1)

Country Link
CN (1) CN116644184B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116934284B (en) * 2023-09-15 2023-12-12 济南市人力资源社会保障智慧服务中心 Human resource data management method based on big data
CN117828002B (en) * 2024-03-04 2024-05-10 济宁蜗牛软件科技有限公司 Intelligent management method and system for land resource information data

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766323A (en) * 2017-09-06 2018-03-06 淮阴工学院 A kind of text feature based on mutual information and correlation rule
CN110580286A (en) * 2019-08-09 2019-12-17 中山大学 Text feature selection method based on inter-class information entropy
CN110610446A (en) * 2019-07-25 2019-12-24 东南大学 County town classification method based on two-step clustering thought
CN110659801A (en) * 2019-08-19 2020-01-07 万马奔腾(上海)数据有限公司 Talent evaluation system based on big data algorithm
CN111125086A (en) * 2018-10-31 2020-05-08 北京国双科技有限公司 Method, device, storage medium and processor for acquiring data resources
CN111461637A (en) * 2020-02-28 2020-07-28 平安国际智慧城市科技股份有限公司 Resume screening method and device, computer equipment and storage medium
CN112001438A (en) * 2020-08-19 2020-11-27 四川大学 Multi-mode data clustering method for automatically selecting clustering number
CN112162972A (en) * 2020-05-06 2021-01-01 西安电子科技大学 Human resource bidirectional recommendation system based on data mining and privacy protection technology
CN115879901A (en) * 2023-02-22 2023-03-31 陕西湘秦衡兴科技集团股份有限公司 Intelligent personnel self-service platform

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080077570A1 (en) * 2004-10-25 2008-03-27 Infovell, Inc. Full Text Query and Search Systems and Method of Use

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766323A (en) * 2017-09-06 2018-03-06 淮阴工学院 A kind of text feature based on mutual information and correlation rule
CN111125086A (en) * 2018-10-31 2020-05-08 北京国双科技有限公司 Method, device, storage medium and processor for acquiring data resources
CN110610446A (en) * 2019-07-25 2019-12-24 东南大学 County town classification method based on two-step clustering thought
CN110580286A (en) * 2019-08-09 2019-12-17 中山大学 Text feature selection method based on inter-class information entropy
CN110659801A (en) * 2019-08-19 2020-01-07 万马奔腾(上海)数据有限公司 Talent evaluation system based on big data algorithm
CN111461637A (en) * 2020-02-28 2020-07-28 平安国际智慧城市科技股份有限公司 Resume screening method and device, computer equipment and storage medium
WO2021169111A1 (en) * 2020-02-28 2021-09-02 平安国际智慧城市科技股份有限公司 Resume screening method and apparatus, computer device and storage medium
CN112162972A (en) * 2020-05-06 2021-01-01 西安电子科技大学 Human resource bidirectional recommendation system based on data mining and privacy protection technology
CN112001438A (en) * 2020-08-19 2020-11-27 四川大学 Multi-mode data clustering method for automatically selecting clustering number
CN115879901A (en) * 2023-02-22 2023-03-31 陕西湘秦衡兴科技集团股份有限公司 Intelligent personnel self-service platform

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Analysis of absence seizure EEG via Permutation Entropy spatio-temporal clustering; Nadia Mammone et al.; 2011 IEEE International Symposium on Medical Measurements and Applications; pp. 532-535 *
A keyword extraction algorithm based on adaptive association entropy; Luo Youzhi; Chen Zhengming; Chen Ming; Mei Wentao; Computer and Modernization (No. 04); pp. 71-75 *

Also Published As

Publication number Publication date
CN116644184A (en) 2023-08-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Zhang Ling

Inventor after: Pan Shuojie

Inventor after: Zhu Tiandian

Inventor before: Zhu Tiandian
