WO2016045153A1 - Information visualization method and intelligent visual analysis system based on text resume information
- Publication number
- WO2016045153A1 (PCT/CN2014/088601, CN2014088601W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- history
- growth
- information
- resume
- experience
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/105—Human resources
- G06Q10/1053—Employment or hiring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/117—Tagging; Marking up; Designating a block; Setting of attributes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/137—Hierarchical processing, e.g. outlines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Definitions
- The invention belongs to the technical field of computer applications and relates to an intelligent visual analysis system and an information visualization method based on text resume information.
- Resume information summarizes a person's experience. It is contained in resume data and mainly includes the individual's basic information and a brief description of their personal experience.
- Basic personal information includes name, gender, date of birth, ethnicity, education, political outlook, religious beliefs, major family members, major social relationships, marriage and personal health.
- Personal experience as an important part of the resume usually includes the individual's past learning experience, job experience and so on.
- Personal resume data is an important basis for personnel assessment. It reflects the individual's past behavior and current ability in many aspects.
- Resume analysis predicts a person's future behavior from the past behavior reflected in resume data. It is widely used in personnel selection and recruitment at enterprises and institutions, cadre assessment and management in government departments, and the research and evaluation of scientific and technological talent mobility.
- Electronic resumes are mainly divided into: 1) public resumes available on the Internet; 2) non-public resumes held by enterprises and talent recruitment systems.
- Electronic resumes can also be divided into two types, structured resumes and unstructured resumes:
- 1) Structured resumes. These are usually in table form and come from personnel recruitment systems or internal management systems. Their structure is standardized and fixed, which is convenient for unified management.
- However, deep semantic analysis is difficult to perform on structured resumes.
- 2) Unstructured resumes. These are usually in text form and come from a wide range of sources, such as major news sites or social media on the Internet. Their structure is diverse, which makes unified analysis and management inconvenient. However, because they use text as the carrier, they often contain rich semantic information and therefore support semantic-based intelligent analysis, such as semantic search and classification.
- CVAS: Curriculum Vitae Analysis System.
- CVAS mainly performs automated analysis and management of structured resume data. With its powerful processing and analysis capabilities, it can quickly filter out non-compliant resumes, greatly improving the efficiency of resume analysis. It can also quantitatively analyze and evaluate resume data according to specific application requirements, making the analysis results more reasonable and reliable. In recent years, CVAS has therefore been increasingly valued by the personnel departments of enterprises and institutions, and is widely used for personnel selection and other human resource management activities.
- Existing CVAS still have the following shortcomings: (1) They are not suited to analyzing unstructured resume data. Unstructured resumes are usually stored as plain-text documents (e.g. txt, Word, PDF). Their formats are not uniform and vary greatly, so they are difficult to feed directly into current CVAS. In other words, current CVAS lack the ability to convert unstructured resumes into structured resumes. (2) Their analysis capability is mainly limited to qualitative analysis and quantitative calculation under simple rules (such as resume screening and scoring) and to statistical management (such as generating resume information reports), while neglecting intelligent mining and intuitive visual analysis of the potential patterns contained in resumes.
- As a result, they cannot help users complete more complex tasks, such as semantic-based resume search and classification, personnel appointment and recommendation, and career planning.
- (3) They analyze each resume in isolation, ignoring the correlations between resumes.
- The potential associations between resumes can reflect potential social relationships between people, which arise from intersections in individual experience, such as classmates, colleagues, fellow villagers, comrades, collaborators, and competitors. Based on these relationships, a potential social network between people can be restored and constructed, which can greatly enhance the scientific management of resumes and the understanding of the potential social associations and organizational hierarchy among people.
- The technical problem solved by the present invention: overcoming the deficiencies of existing methods and systems, the invention provides an intelligent visual analysis system and an information visualization method based on text resume information. It makes full use of the potential pattern information in resume data and, based on natural language processing, data mining, machine learning, and information visualization technologies, builds a visual analysis environment that helps users understand the potential growth patterns in resumes and the potential associations between resumes, thereby supporting tasks such as semantic-based resume search and classification, personnel appointment and dismissal, career planning, and interpersonal relationship analysis.
- The inventive technology is a general framework for discovering the potential growth patterns contained in resume data and the potential social relationships between people, and for expressing these pattern features and social relationships in an intuitive visual way. It can be widely applied to intelligent mining and information visualization of staff resumes, cadre resumes, corporate executive resumes, and researcher resumes.
- An intelligent visual analysis system based on text resume information, comprising: a text resume preprocessing module; a personal growth experience quantification module; a personal growth pattern mining module; a group potential social relationship mining module; an organization generation module; a resume information visualization module; and a resume visual analysis module, wherein:
- Text resume preprocessing module: preprocesses the unstructured text resume data, extracts the effective elements of the resume information (including basic personal information and experience information), and produces structured resume element data in XML (Extensible Markup Language).
- The module uses natural language processing technology to convert multi-source resume texts with non-uniform formats into resume element data with a unified structure, which provides the data foundation for the subsequent modules.
- Personal growth experience quantification module: performs quantitative calculation of the experience level for the experience information in the resume elements, thereby obtaining the growth trajectory sequence data.
- The module uses natural language processing technology to quantify the experience information in the resume elements into level information, which provides a basis for the mining and visualization performed by subsequent modules.
- Personal growth pattern mining module: uses machine learning and data mining technology to analyze the growth trajectory sequence data in the time dimension and the spatial dimension, and obtains the temporal and spatial growth patterns of the resume.
- Group potential social relationship mining module: uses association algorithms from data mining to correlate the growth trajectory sequence data of multiple resumes and obtain the potential social relationships between resumes (such as classmates, colleagues, fellow villagers, comrades, collaborators, and competitors).
- Organization generation module: based on the potential social relationships of the group represented by multiple resumes, extracts and restores the hierarchical information of the organization from the unit intersection information of the group.
- Resume information visualization module: based on an information visualization method for text resume information, converts the growth trajectory sequence data and the mining results output by each mining module into intuitive, easy-to-understand visualizations. The generated visualizations help users quickly grasp the characteristics of the resume data and the knowledge it contains.
- Resume visual analysis module: builds a visual analysis environment based on the information visualizations and uses human-computer interaction technology to help the user understand the potential information and pattern features in resumes from the time and space dimensions, thereby obtaining deep knowledge.
- The space-time trajectory visualization algorithm is based on the idea of a growth metaphor and transforms the abstract growth information in a resume into a visual representation of a space-time trajectory.
- By visualizing the growth trajectory sequence data, the space-time trajectory visualization generated by the algorithm expresses the originally abstract personal growth information intuitively as a space-time diagram.
- Resume potential social network visualization algorithm: builds a visual representation of the social network of the resumes based on the potential social relationships between them.
- The generated potential relationship map expresses the originally abstract potential relationships between resumes in the form of a network diagram.
- Resume organization level visualization algorithm: based on the potential social relationships between resumes, constructs a visual representation of the organizational levels of the unit in which each person is located.
- The algorithm extracts the unit intersection information between resumes from the resume information, converts the resumes with unit intersections into the organizational hierarchy of the corresponding unit, and visualizes it as an organization chart based on a table structure.
- The present invention uses resume data in the form of unstructured text as the data source and, based on natural language processing technology, satisfies the need for unified processing of multi-source heterogeneous resume data through its structured resume element extraction mechanism, which broadens the scope of application of the system and method.
- The present invention focuses on intelligent mining of the potential pattern information contained in resume data and performs deep visual analysis of that pattern information. It can obtain the growth trajectory patterns and growth category patterns in the resume data, thereby supporting deep analysis tasks such as semantic-based resume search and classification, personnel assessment, and appointment and dismissal recommendations.
- The present invention introduces the potential associations between resumes into the analysis process. Through mining and information visualization technology, the potential social relationships between the persons represented by the resumes can be obtained. Based on these relationships, a potential social network between people can be constructed, and based on that social network, the organizational hierarchy among the people can be restored. The pattern features reflected by a large number of resumes are thus presented to the user from a macroscopic perspective, giving a deep understanding of the social relationships of the group.
- FIG. 1 is a block diagram of the constituent modules of the present invention.
- FIG. 2 is a system architecture diagram of the present invention.
- FIG. 3 shows example definitions of the growth trajectory categories in the time dimension, wherein: (a) is a growth-type trajectory, (b) is a steady-type trajectory, (c) is a fluctuating-type trajectory, and (d) is a declining-type trajectory.
- In each panel, the solid line is the personal growth trajectory and the dotted line is the average growth trajectory of the overall sample.
- FIG. 4 shows example definitions of the growth trajectory categories in the spatial dimension, wherein: (a) is a "local → central" trajectory, (b) is a "local → central → local" trajectory, (c) is a "central → local" trajectory, and (d) is a "central → local → local → central" trajectory.
- Figure 5 is a schematic diagram showing the results of classification of personal growth trajectories.
- FIG. 6 is a schematic diagram showing the results of group potential relationship mining, wherein: (a) shows the growth trajectory similarity relationship graph, and (b) shows the experience intersection relationship graph.
- Figure 7 is a graph of personal growth, in which: (a) is a growth trajectory of the time dimension, and (b) is a growth trajectory of the spatial dimension.
- Figure 8 is a potential relationship diagram.
- Figure 9 is a diagram of the organization.
- FIG. 10 is a schematic diagram of statistical analysis of resume trajectory information.
- FIG. 11 is a schematic diagram of spatio-temporal association interaction analysis of a resume trajectory, wherein:
- (a) is a time trajectory map;
- (b) is a spatial trajectory map;
- the experience segment shown by the dashed box in (a) corresponds to the growth trajectory indicated by the dotted arrow in (b).
- FIG. 12 is a schematic diagram of visual pattern analysis of a resume space-time trajectory.
- the picture shows the “growth period”, “bottleneck period” and “breakthrough period” modes reflected in the personal growth process.
- the “growth period” represents the rapid promotion of the early stage of life
- the “bottleneck period” represents a bottleneck in the middle of the career, and the promotion is slower
- the “breakthrough period” represents the breakthrough bottleneck at the end of the career and continues to advance.
- FIG. 13 is a schematic diagram of interactive visual analysis of a resume social network.
- (a) is a time trajectory map
- (b) is a spatial trajectory map
- (c) is a social network map.
- the dashed box in (a) corresponds to the dashed box in (b) in the space-time dimension, and the details of the resumes that intersect there are shown in (c).
- Personnel: the subjects represented by resumes, such as employees of enterprises and institutions, government department cadres, corporate executives, and scientific research personnel.
- System user: usually a decision maker, such as a leader or other manager.
- The invention is based on natural language processing, data mining, machine learning, and information visualization technology. It constructs a visual analysis environment that makes full use of the information in text resume data and extracts the potential knowledge in resume information that plays an important role in decision making. This knowledge is displayed in an intuitive visualization based on the growth metaphor, which helps the user understand the potential pattern features of resumes and the potential associations between them, thereby supporting tasks such as fuzzy resume search and intelligent classification, automatic personnel appointment and dismissal, career planning, and interpersonal relationship analysis.
- As shown in FIG. 1, the present invention includes: a text resume preprocessing module, a personal growth experience quantification module, a personal growth pattern mining module, a group potential social relationship mining module, an organization generation module, a resume information visualization module, and a resume visual analysis module.
- The system architecture of the present invention is shown in FIG. 2, wherein:
- The text resume preprocessing module preprocesses unstructured resume text data using natural language processing techniques such as format filtering, Chinese word segmentation, and named entity recognition, extracts the effective elements of the resume information, and obtains structured resume element data in XML (Extensible Markup Language).
- The XML data format is designed according to the characteristics of resume data.
- The XML data is hierarchical and is structured as follows.
- The XML data contains two kinds of resume elements: the basic resume information and the experience information table.
- The basic resume information includes items such as name, gender, ethnicity, and place of birth;
- the experience information table is a table whose header includes fields such as start time, end time, place, unit, and position.
- Each record in the table represents one experience element of a person, that is, the person's experience (employment or study) within a certain period of time.
- Unstructured resume text data mainly includes text resumes from the Internet (html format), text resumes from personnel systems (txt, Word, PDF, etc.), and other personnel file resumes (stored in databases).
- Internet text resumes are usually obtained from the Internet by a web crawler. Because their format is complex and not uniform, their preprocessing is also the most complicated.
- The module specifically includes the following steps:
- An html analysis algorithm is used to remove noise such as advertisements and html markup from the original resume text, yielding a pure resume text that contains the resume information.
- The pure resume text consists of two text segments: a basic information segment and an experience information segment. Note that this step applies only to Internet text resume data.
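The noise-removal step can be illustrated with a minimal sketch. The patent does not specify its html analysis algorithm, so the code below is only one plausible approach using Python's standard `html.parser`; the class name `ResumeTextExtractor`, the function `strip_html_noise`, and the choice of tags treated as noise are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class ResumeTextExtractor(HTMLParser):
    """Collects visible text while skipping tags assumed to carry noise."""

    # Assumption: these tags hold scripts, styles, or advertisement/navigation blocks.
    SKIPPED_TAGS = {"script", "style", "nav", "aside", "footer"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped tag
        self._chunks = []      # collected text fragments

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def strip_html_noise(html):
    """Return the plain text of an html page with noisy blocks removed."""
    parser = ResumeTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<aside>Buy now!</aside><p>Name: Zhang San</p>"
    "<p>1989-1991: clerk</p><script>var x=1;</script></body></html>"
)
pure_text = strip_html_noise(sample)
```

A real crawler output would need a richer notion of "noise" (for example, filtering by block position or link density); the tag list here is only a placeholder.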
- The hierarchical structure organizes the resume information into a basic information segment (basic_info) and an experience information segment (office_record_array).
- The basic information segment holds the basic information of the resume, and its structure is a fixed list.
- The experience information segment is designed as a tree structure whose nodes are the individual experience segments (office_record).
- The tree structure has good scalability and can be expanded and queried easily and quickly. This structure can significantly improve the efficiency of feature matching calculations on large-scale resume data.
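As a rough illustration of this hierarchy, the sketch below builds one resume element document with Python's standard `xml.etree.ElementTree`. Only the names `basic_info`, `office_record_array`, and `office_record` come from the text; the root tag `resume`, the field names (`start_time`, `end_time`, and so on), and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET


def build_resume_xml(basic_info, office_records):
    """Build a resume element tree: a fixed basic_info list plus a tree of office_record nodes."""
    root = ET.Element("resume")  # root tag name is an assumption
    basic = ET.SubElement(root, "basic_info")
    for field, value in basic_info.items():
        ET.SubElement(basic, field).text = value
    records = ET.SubElement(root, "office_record_array")
    for record in office_records:
        node = ET.SubElement(records, "office_record")
        for field, value in record.items():
            ET.SubElement(node, field).text = value
    return root


# Hypothetical sample data; field names mirror the experience table header.
resume = build_resume_xml(
    {"name": "Zhang San", "gender": "male"},
    [
        {"start_time": "1989-01", "end_time": "1991-06", "place": "Jinan",
         "unit": "Example Bureau", "position": "clerk"},
        {"start_time": "1991-06", "end_time": "1994-01", "place": "Jinan",
         "unit": "Example Office", "position": "section chief"},
    ],
)

# Querying the tree: list every unit the person has worked in.
units = [record.findtext("unit") for record in resume.iter("office_record")]
xml_text = ET.tostring(resume, encoding="unicode")
```

New experience segments can be appended as further `office_record` children without changing the rest of the document, which is the kind of extensibility the tree structure is meant to provide.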
- The resume element extraction algorithm mentioned in step 2 is the core algorithm of the module; it mainly uses regular expression matching to extract each element.
- The algorithm specifically includes the following steps:
- The "time" and "place" elements are extracted by regular expression matching:
- the "time" element is extracted using "year" as the matching keyword;
- the "place" element is extracted using "province", "city", "county", "township", etc. as the matching keywords.
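A minimal sketch of this keyword-driven matching is shown below. The patent does not give its regular expressions, so both patterns are assumptions: the time pattern keys on the Chinese character 年 ("year"), and the place pattern keys on 省, 市, 县, and 乡 ("province", "city", "county", "township"). The sample line is hypothetical and uses spaces between fields so that the simple place pattern stays unambiguous.

```python
import re

# Assumed pattern: a four-digit year followed by 年, optionally followed by a month.
TIME_PATTERN = re.compile(r"\d{4}年(?:\d{1,2}月)?")
# Assumed pattern: a short run of Chinese characters ending in 省/市/县/乡.
PLACE_PATTERN = re.compile(r"[\u4e00-\u9fff]+?(?:省|市|县|乡)")


def extract_time_and_place(line):
    """Return (time elements, place elements) found in one line of resume text."""
    return TIME_PATTERN.findall(line), PLACE_PATTERN.findall(line)


# Hypothetical experience line: period, province, city, unit, position.
times, places = extract_time_and_place("1989年1月 - 1991年6月 山东省 济南市 某局 科员")
```

On unsegmented running text the place pattern would also swallow preceding characters, which is one reason the module applies word segmentation before element extraction.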
- Each row in the unit keyword dictionary has two parts: a "keyword" and its "auxiliary keywords".
- "Auxiliary keywords" are of two types, R type and L type, and multiple auxiliary keywords are separated by commas.
- The principle of unit recognition with the unit keyword dictionary is: when a "keyword" in the dictionary is found in the text, with no R-type "auxiliary keyword" on its right and no L-type "auxiliary keyword" on its left, recognition succeeds; otherwise, recognition fails.
- For example, the fourth row of Table 1 represents the keyword "part", whose R-type auxiliary keywords are "long" and "team" and whose L-type auxiliary keyword is "dry".
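The recognition rule can be sketched as follows. The English words "part", "long", "team", and "dry" appear to be literal renderings of single Chinese characters; the sketch assumes they stand for 部, 长, 队, and 干, so that 部长 (minister), 部队 (troops), and 干部 (cadre) are rejected as units. That mapping, the dictionary layout, and the function name `find_unit_keywords` are assumptions, and "on the right/left" is interpreted as "immediately adjacent".

```python
# Assumed in-memory form of one dictionary row: keyword -> R-type and L-type auxiliary keywords.
UNIT_KEYWORDS = {
    "部": {"R": ["长", "队"], "L": ["干"]},
}


def find_unit_keywords(text, dictionary=UNIT_KEYWORDS):
    """Return (keyword, position) pairs where a unit keyword is recognized.

    A hit is rejected when an R-type auxiliary keyword follows it directly
    or an L-type auxiliary keyword precedes it directly.
    """
    hits = []
    for keyword, aux in dictionary.items():
        start = text.find(keyword)
        while start != -1:
            end = start + len(keyword)
            blocked_right = any(text.startswith(a, end) for a in aux["R"])
            blocked_left = any(text.endswith(a, 0, start) for a in aux["L"])
            if not (blocked_right or blocked_left):
                hits.append((keyword, start))
            start = text.find(keyword, start + 1)
    return hits


recognized = find_unit_keywords("外交部")   # a ministry name: keyword accepted
rejected = [find_unit_keywords(t) for t in ("部长", "部队", "干部")]
```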
- The personal growth experience quantification module obtains the growth trajectory sequence data from the resume element XML data.
- Each element of the sequence data is a six-tuple <start time, end time, location, unit, position, quantization level>, in which the last field, "quantization level", represents the level of the experience segment.
- The core algorithm of this module is the experience level quantification algorithm.
- The algorithm specifically includes the following steps:
- The experience information table of each text resume is sorted in ascending order by the "start time" field to obtain an ordered experience information table.
- Step 2 is repeated until the whole ordered experience information table has been scanned and processed.
- The experience segments, each with its quantization level, are assembled into an ordered sequence to obtain the growth trajectory sequence data (see Table 2).
- The quantification library is a dictionary whose elements are <unit, job title, quantization level> triples.
- The dictionary serves as the basis of the personal growth experience quantification module and is constructed through human-computer interaction:
- the units and job titles can be extracted from the resume corpus by the text resume preprocessing module, and the user can also add and modify them.
- The initial quantization value is first calculated by the computer according to a level quantization rule; the user can then handle special cases (see the special case explanation below) according to their own knowledge and experience, which ensures the correctness of the adjusted quantized values.
- The level quantization rule mentioned in step 2-2 depends on the specific application scenario:
- For cadre resumes, the levels can be divided into: national level (quantified as 5), provincial level (quantified as 4), departmental level (quantified as 3), county level (quantified as 2), township level (quantified as 1), and so on, and each level can be further subdivided into chief and deputy ranks.
- For scientific research personnel, the levels can be divided into: academician (quantified as 5), researcher (quantified as 4), associate researcher (quantified as 3), assistant researcher (quantified as 2), intern researcher (quantified as 1), and so on.
- Special case: the computer can calculate the level from a position field of the form "XX Mayor" (quantified as 3), which is correct in general; however, if the position field is "Mayor of Beijing" or "Mayor of Shanghai", that is, the mayor of a municipality directly under the central government, the level should be quantified as provincial (quantified as 4) because of its administrative particularity.
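The quantification flow described above (sort by start time, look up each experience segment in the quantification library, let user-adjusted entries override the rule-based value) can be sketched as below. The lookup step itself is not spelled out in the text, so treating the library as a `(unit, job title) -> level` mapping is an assumption, as are the `Experience` type, the sample units, and the default level of 0 for unknown entries. The numeric levels are the cadre levels given in the text.

```python
from typing import NamedTuple


class Experience(NamedTuple):
    start: str      # ISO-like "YYYY-MM" strings sort chronologically
    end: str
    location: str
    unit: str
    position: str


# Cadre levels as listed in the text.
LEVELS = {"national": 5, "provincial": 4, "departmental": 3, "county": 2, "township": 1}

# Hypothetical quantification library: (unit, job title) -> quantization level.
QUANT_LIBRARY = {
    ("Example City Government", "Mayor"): LEVELS["departmental"],
    ("Example County Government", "County Head"): LEVELS["county"],
    # User-adjusted special case: mayor of a municipality directly under the central government.
    ("Beijing Municipal Government", "Mayor"): LEVELS["provincial"],
}


def build_growth_trajectory(experiences, library, default_level=0):
    """Sort experiences by start time and append a quantization level to each,
    producing six-tuples (start, end, location, unit, position, level)."""
    ordered = sorted(experiences, key=lambda e: e.start)
    return [(*e, library.get((e.unit, e.position), default_level)) for e in ordered]


experiences = [
    Experience("2003-01", "2008-01", "Beijing", "Beijing Municipal Government", "Mayor"),
    Experience("1995-06", "2003-01", "Example City", "Example City Government", "Mayor"),
]
trajectory = build_growth_trajectory(experiences, QUANT_LIBRARY)
levels = [segment[-1] for segment in trajectory]
```

Keeping the special cases as ordinary library entries means the user's corrections are applied through the same lookup as the rule-based values, with no separate code path.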
- The growth pattern classification algorithm in this module applies supervised machine learning classification algorithms (such as Naïve Bayes and SVM (Support Vector Machine)) to resume data.
- Unknown resumes can be classified automatically based on the growth patterns of known resumes, so the user can quickly grasp the growth type to which a resume belongs and predict its future development trend from the growth pattern.
- The algorithm specifically includes the following steps:
- The four personal growth trajectory types are defined relative to the average of the overall sample (see the dotted line in Fig. 3).
- The personal growth rate (the slope of the curve in Fig. 3) can be obtained by measuring the time span spent at each level in the personal growth trajectory.
- The growth rate of the growth type is significantly larger than the sample average over the entire time dimension; that of the steady type is approximately equal to the sample average; that of the fluctuating (wave) type is greater than the sample average at some stages and smaller at others; and that of the declining type is significantly smaller than the sample average over the entire time dimension.
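These four definitions can be read as a simple rule over per-stage growth rates. The sketch below is only an illustration of that reading, not the patent's classifier: the `tolerance` parameter stands in for "significantly" and "approximately", and treating every mixed case as fluctuating is an additional assumption.

```python
def classify_growth_type(rates, sample_mean_rates, tolerance=0.1):
    """Label a trajectory by comparing its per-stage growth rates with the sample mean.

    rates[k] is the personal growth rate at stage k; sample_mean_rates[k] is the
    overall sample's mean rate at the same stage.
    """
    diffs = [r - m for r, m in zip(rates, sample_mean_rates)]
    if all(d > tolerance for d in diffs):
        return "growth"        # clearly above the mean everywhere
    if all(d < -tolerance for d in diffs):
        return "declining"     # clearly below the mean everywhere
    if all(abs(d) <= tolerance for d in diffs):
        return "steady"        # close to the mean everywhere
    return "fluctuating"       # above at some stages, below or equal at others


mean_rates = [0.5, 0.5]
labels = [
    classify_growth_type([1.0, 1.2], mean_rates),
    classify_growth_type([0.5, 0.55], mean_rates),
    classify_growth_type([1.0, 0.1], mean_rates),
    classify_growth_type([0.1, 0.2], mean_rates),
]
```

In the method itself such labels are assigned manually to the sample data; a rule like this could at most pre-label candidates for a human to confirm.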
- The "features" here are features in the machine learning and data mining sense: they describe the different types of growth trajectory sequence data.
- A machine learning or data mining algorithm can learn the type of the data, or mine its patterns, only through the features of the data.
- 1) Features of the time dimension. From the time dimension types described in step 1, the growth rate of the growth trajectory sequence data can be used as its time dimension feature. This growth rate can be quantified as two kinds of features:
- Time span at each level, representing the time an individual spent at each level.
- Its formal expression is: "<quantization level 1, time span 1>, <quantization level 2, time span 2>, ..., <quantization level n, time span n>",
- where n is the sequence length of the growth trajectory sequence data (the number of elements in the sequence data).
- The time span is obtained by subtracting the "start time" from the "end time" of each element in the sequence data.
- For example, the per-level time span feature of the sequence data shown in Table 2 is: "<0, 3>, <1, 0>, <2, 3>, <3, 3>, <4, 4>, <5, 8>, <6, 4>, <7, 0>, <8, 0>".
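A small sketch of this feature follows. Table 2 is not reproduced in this text, so the sample trajectory is hypothetical, with times given as whole years for simplicity. Summing the spans of several segments at the same level, and reporting levels that never occur with a span of 0, are assumptions suggested by the `<1, 0>` entry in the example above.

```python
def level_time_spans(sequence, num_levels):
    """Return [(level, time span), ...] for levels 0..num_levels-1.

    sequence holds (start_year, end_year, quantization_level) triples; the span of
    each element is end minus start, accumulated per level.
    """
    spans = [0] * num_levels
    for start, end, level in sequence:
        spans[level] += end - start
    return list(enumerate(spans))


# Hypothetical trajectory: three segments at levels 0, 2, and 3.
sample_sequence = [(1989, 1992, 0), (1992, 1995, 2), (1995, 1998, 3)]
span_feature = level_time_spans(sample_sequence, num_levels=4)
```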
- Time-phase growth slope, representing the slope of the individual's growth trajectory in each time period.
- Its formal expression is: "<time phase 1, slope 1>, <time phase 2, slope 2>, ..., <time phase m, slope m>".
- For example, the sequence data shown in Table 2 can be divided into 10 time phases: "1989.1.1-1991.6.1", "1991.6.1-1994.1.1", ..., "2011.6.1-2014.1.1".
- The slope of the growth trajectory in each time phase is the difference between the quantization level at the end of the phase and the quantization level at the beginning of the phase, so the time-phase growth slope feature is: "<1, 0>, <2, 2>, <3, 1>, <4, 1>, <5, 0>, <6, 1>, <7, 0>, <8, 0>, <9, 1>, <10, 0>".
- The time dimension features may be used alone or in combination in the machine learning process.
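The time-phase growth slope can be sketched as below, assuming equal-width phases and times expressed as decimal years. The trajectory is hypothetical (Table 2 is not available here), and reading the level at a phase boundary as the level of the most recent segment that has started by then is an assumption.

```python
def level_at(sequence, t):
    """Level held at time t: the level of the last segment whose start is <= t.

    sequence holds (start, end, level) triples sorted by start time.
    """
    current = sequence[0][2]
    for start, _end, level in sequence:
        if start <= t:
            current = level
    return current


def phase_slopes(sequence, t_start, t_end, num_phases):
    """Return [(phase index, slope), ...], where slope is the level at the end
    of the phase minus the level at its beginning."""
    width = (t_end - t_start) / num_phases
    bounds = [t_start + k * width for k in range(num_phases + 1)]
    return [
        (k + 1, level_at(sequence, bounds[k + 1]) - level_at(sequence, bounds[k]))
        for k in range(num_phases)
    ]


# Hypothetical trajectory in decimal years, split into four 2.5-year phases.
sample_trajectory = [(1989.0, 1991.5, 0), (1991.5, 1994.0, 2), (1994.0, 1999.0, 3)]
slope_feature = phase_slopes(sample_trajectory, 1989.0, 1999.0, 4)
```

The per-level time span and the time-phase slope can be concatenated into one feature vector when both are used together for training.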
- 2) Features of the spatial dimension, also called the "spatial sequence".
- From the spatial dimension types described in step 1, the geographic location of the unit in which the individual works can be used as the spatial dimension feature of the growth trajectory sequence data.
- The feature is formalized as: "<place type 1, place type 2, ..., place type k>".
- The "place type" takes attribute values such as "central" and "local" described in step 1, and k is the number of place types in the "place" field of the growth trajectory sequence data.
- For example, the spatial dimension feature of the sequence data shown in Table 2 is: "<local, central>".
- The spatial dimension features here correspond to "sequences" in sequence pattern mining, and the spatial dimension growth types described in step 1 are the "sequence patterns" found from a number of "sequences".
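One way to derive such a spatial sequence is to map each "place" value to a place type and collapse consecutive repeats, so that a career spent in two local posts and then a central one yields "<local, central>". The sketch below takes that reading; the rule for deciding what counts as "central" is not given in the text, so the `CENTRAL_PLACES` set and the sample places are purely hypothetical.

```python
from itertools import groupby

# Hypothetical rule: places in this set count as "central", everything else as "local".
CENTRAL_PLACES = {"Central Example"}


def place_type(place):
    return "central" if place in CENTRAL_PLACES else "local"


def spatial_sequence(places):
    """Map each place to its type and collapse consecutive duplicates."""
    types = [place_type(p) for p in places]
    return [t for t, _group in groupby(types)]


seq_a = spatial_sequence(["Jinan", "Qingdao", "Central Example"])
seq_b = spatial_sequence(["Jinan", "Central Example", "Jinan"])
```

Only consecutive duplicates are collapsed, so a return to a local post after a central one stays visible as "local, central, local".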
- For the growth trajectory sequence data in the known resume element XML data (referred to as "sample data"), the time dimension growth type is marked manually according to the time dimension growth type definitions described in step 1 and the time dimension features described in step 2.
- A machine learning classifier is then trained on the labeled sample data to learn the classifier model parameters.
- If a mined sequence pattern corresponds to one of the spatial dimension growth types described in step 1, its spatial dimension growth type can be marked manually.
- Otherwise, the sequence pattern is considered a spatial sequence pattern that has not appeared in the sample data; it can be used as a new spatial dimension growth type, and its type definition can be given manually for future resume classification tasks.
- Figure 5 is a schematic diagram of the classification results, in which:
- person A is the growth type;
- person B is the steady type;
- person C is the fluctuating type.
- the social relationship mining algorithm in this module innovatively uses the growth trajectory distance measurement algorithm and the association rule algorithm to mine the potential social relationship R between resumes (such as classmates, colleagues, fellow villagers, comrades, collaborators, competitors, etc.) ).
- the algorithm specifically includes the following steps:
- the known history database M has size n, where n is the number of all resumes.
- each element M1 to Mn of M is the history element XML data of one resume.
- the similarity sim(i,j) of the growth trajectory sequence data between any two resumes Mi and Mj in M is measured with the cosine similarity algorithm, giving the similarity matrix sim.
- the matching degree mch(i,j) between any two resumes Mi and Mj in M is measured with the history element matching degree algorithm, giving the matching degree matrix mch.
- the matrix sim is scanned: if sim(i,j) > s0, where s0 is a similarity threshold, the growth trajectories of Mi and Mj are considered similar, and the larger sim(i,j) is, the more similar they are. In other words, the magnitude of sim(i,j) measures the strength of the similarity.
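The similarity step can be sketched as below. The text does not say how a growth trajectory is turned into a vector before cosine similarity is applied, so the per-year level vector built by the hypothetical helper `level_vector` is an assumption, as are the function names.

```python
import math
from typing import List, Tuple

Segment = Tuple[int, int, int]  # (start year, end year, quantization level), assumed layout


def level_vector(segments: List[Segment], first_year: int, last_year: int) -> List[float]:
    """Quantization level held in each year of [first_year, last_year]; 0.0 where none applies."""
    vec: List[float] = []
    for year in range(first_year, last_year + 1):
        level = 0.0
        for start, end, lvl in segments:
            if start <= year < end:
                level = float(lvl)
        vec.append(level)
    return vec


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """The n x n matrix sim, with sim[i][j] = cosine(vectors[i], vectors[j])."""
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def similar_pairs(sim: List[List[float]], s0: float) -> List[Tuple[int, int]]:
    """Index pairs (i, j), i < j, whose similarity exceeds the threshold s0."""
    n = len(sim)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if sim[i][j] > s0]
```

The threshold `s0` is left as a parameter because the text gives no value for it.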
- FIG. 6 is a schematic diagram showing the results of potential relationship mining.
- the history element matching degree algorithm mentioned in step 3 takes Mi and Mj as input and outputs the matching degree mch(i,j) of Mi and Mj, the difference element components err(i,j) of Mi relative to Mj, and the history element intersection its(i,j) of Mi and Mj.
- the algorithm specifically includes the following steps:
- two counters Ct and Cr are set, both with initial value 0: Ct is the number of element comparisons between Mi and Mj, and Cr is the number of comparisons in which Mi and Mj have the same element.
- a difference element component list err(i,j) is defined, whose elements are the history elements that differ between Mi and Mj.
- a history element intersection list its(i,j) is defined, whose elements are the history elements that are the same between Mi and Mj.
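A minimal sketch of the matching degree computation follows, using the formula mch(i,j) = Cr/Ct given in the claims. The dictionary layout of a resume and the pairing of experience rows by position are assumptions; the source does not specify how rows of the two experience tables are aligned.

```python
from typing import Any, Dict, List, Tuple

FIELDS = ("time", "place", "unit", "position")  # fields scanned in each experience row


def match_degree(mi: Dict[str, Any], mj: Dict[str, Any]) -> Tuple[float, List[tuple], List[tuple]]:
    """Return (mch, its, err) for two resumes of the assumed form
    {"basic": {...}, "experience": [{...}, ...]}."""
    ct = 0  # Ct: number of element comparisons
    cr = 0  # Cr: number of comparisons with equal values
    its: List[tuple] = []  # shared history elements
    err: List[tuple] = []  # elements of mi that differ from mj

    # Scan the basic information elements item by item.
    for key, value in mi["basic"].items():
        ct += 1
        if mj["basic"].get(key) == value:
            cr += 1
            its.append((key, value))
        else:
            err.append((key, value))

    # Scan the experience tables row by row (rows paired by index: an assumption).
    for row_i, row_j in zip(mi["experience"], mj["experience"]):
        for field in FIELDS:
            ct += 1
            if row_i.get(field) == row_j.get(field):
                cr += 1
                its.append((field, row_i.get(field)))
            else:
                err.append((field, row_i.get(field)))

    mch = cr / ct if ct else 0.0
    return mch, its, err
```

Returning `its` and `err` alongside `mch` mirrors the three outputs the text attributes to the algorithm.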
- the organization generation algorithm in the module innovatively extracts and restores the hierarchical relationship of the organization from the potential social relationships among the plurality of resumes, and provides a basis for the visualization algorithm of the subsequent organization chart.
- the algorithm specifically includes the following steps:
- the potential social relationship matrix R is output by the group potential social relationship mining module; its size is n × n, each element R11 to Rnn represents a potential social relationship between resumes, and the matrix element Rij represents the potential social relationship between resume Mi and resume Mj.
- the organization library V is a list structure: <V1, V2, ..., Vm>.
- the elements in the library are in a tree structure: the root node of the tree is the "organization name", and the leaf nodes are "member information".
- the specific structure of the elements in the library is as follows: <organization name, <member 1, position 1, incumbent or not>, <member 2, position 2, incumbent or not>, ..., <member m, position m, incumbent or not>>.
- step 4 is repeated until the traversal of R is complete. At this point, all elements in V are the required organization information.
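The traversal of R into the organization library V might look like the sketch below. How an element Rij encodes a unit intersection is not stated, so here `R[i][j]` is assumed to be a list of shared unit names, and `members` is a hypothetical lookup from (resume index, unit) to the member triple.

```python
from typing import Dict, List, Tuple

Member = Tuple[str, str, bool]  # (member name, position, incumbent or not)


def build_organization_library(
    R: List[List[List[str]]],
    members: Dict[Tuple[int, str], Member],
) -> Dict[str, List[Member]]:
    """Collect, for every unit shared by two resumes, the members who served there.

    V maps the organization name (root node) to its member records (leaf nodes).
    """
    V: Dict[str, List[Member]] = {}
    n = len(R)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for unit in R[i][j]:
                for idx in (i, j):
                    record = members[(idx, unit)]
                    if record not in V.setdefault(unit, []):
                        V[unit].append(record)
    return V
```

A dictionary is used here in place of the list <V1, V2, ..., Vm> only to make duplicate checks simple; `list(V.items())` recovers the list form.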
- the module presents the history information to the user in an intuitive way, making it easy to view and helping the user understand it correctly.
- the module contains three visualization algorithms: the history time-space trajectory visualization algorithm, the history social network visualization algorithm, and the history organization visualization algorithm. Based on the three algorithms, the following visualization maps can be generated: the personal growth graph, the potential relationship graph, and the organization graph.
- the personal growth graph is drawn based on the time-space trajectory visualization algorithm.
- the algorithm uses a growth metaphor: by visualizing the growth trajectory sequence data, the generated time-space trajectory map expresses the originally abstract personal growth information intuitively in the form of a time-space map.
- the specific steps of the algorithm are as follows:
- for the time-dimension map, the horizontal axis is the time axis, with "year" and "age" display modes; the vertical axis is the grade axis, which represents the "quantization level" dimension of the growth trajectory sequence data (for example, for cadres, levels such as "section level", "division level", and "office level"; for researchers, "intern researcher", "assistant researcher", "associate researcher", "researcher", "academician", etc.).
- for the spatial-dimension map, the horizontal axis is the time axis, with "year" and "age" display modes; the vertical axis is the spatial axis, which uses a two-dimensional map as the spatial reference system and represents spatial dimensions such as "place" and "unit" of the growth trajectory sequence data.
- the growth trajectory sequence data of a resume consists of a series of experience segments, each of which is the basic unit of the growth trajectory sequence data.
- ① trajectory visualization in the time dimension: each experience segment uses a horizontal, color-filled rectangular block with fixed height and variable length as its visual metaphor.
- the horizontal position of the rectangular block corresponds to the time axis, and its length represents the time interval of the experience segment (the left edge represents the "start time" and the right edge the "end time").
- the vertical position of the rectangular block corresponds to the grade axis and represents the "quantization level" of the experience segment.
- the rectangular blocks are connected by vertical lines in the chronological order of the experience segments, forming a complete visual representation of the time-dimension growth trajectory.
- the time-dimension growth trajectories of different resumes are distinguished by the fill color of their rectangular blocks.
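The geometry of the time-dimension trajectory can be sketched as follows. The block height `BLOCK_HEIGHT`, the `Block` type, and the placement of each vertical connector at the right edge of the earlier block are illustrative assumptions; actual drawing is left to whatever plotting layer is used.

```python
from typing import List, NamedTuple, Tuple

BLOCK_HEIGHT = 0.6  # assumed fixed height of every block, in grade-axis units


class Block(NamedTuple):
    x: float       # left edge on the time axis (start time)
    y: float       # position on the grade axis (quantization level)
    length: float  # end time minus start time
    height: float


def layout_blocks(segments: List[Tuple[int, int, int]]) -> List[Block]:
    """Map (start, end, level) experience segments to blocks in chronological order."""
    ordered = sorted(segments, key=lambda s: s[0])
    return [Block(float(s), float(lvl), float(e - s), BLOCK_HEIGHT) for s, e, lvl in ordered]


def connectors(blocks: List[Block]) -> List[Tuple[Tuple[float, float], Tuple[float, float]]]:
    """Vertical line from the right edge of each block to the level of the next block."""
    return [((a.x + a.length, a.y), (a.x + a.length, b.y)) for a, b in zip(blocks, blocks[1:])]
```

Fill color is omitted from the sketch; one color per resume would be assigned when the blocks are drawn.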
- ② trajectory visualization in the spatial dimension: each experience segment uses a color-filled circle with variable radius as its visual metaphor.
- the position of the circle is mapped onto the two-dimensional map of the spatial axis and represents geographic information such as "place" and "unit" of the experience segment.
- the circles are connected by variable-width, color-filled directed arrows in the chronological order of the experience segments, forming a complete visual representation of the spatial-dimension growth trajectory; the width of a directed arrow changes from its start point to its end point, representing the change in "quantization level" between the two experience segments (the width represents the level).
- the spatial-dimension growth trajectories of different resumes are distinguished by the fill color of the elements they contain.
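As a rough illustration of the directed arrows, the sketch below derives each arrow's start and end width from the quantization levels of the two experience segments it connects. The linear width mapping, the parameter names, and the coordinate values used in any call are assumptions, not values from the source.

```python
from typing import List, NamedTuple, Tuple


class Arrow(NamedTuple):
    start: Tuple[float, float]  # map position of the earlier experience segment
    end: Tuple[float, float]    # map position of the later experience segment
    start_width: float
    end_width: float


def spatial_arrows(
    stops: List[Tuple[float, float, int]],
    base_width: float = 1.0,
    width_per_level: float = 1.0,
) -> List[Arrow]:
    """Build arrows between consecutive stops given as (x, y, quantization level).

    The stops are assumed to be in chronological order already.
    """
    arrows: List[Arrow] = []
    for (x0, y0, l0), (x1, y1, l1) in zip(stops, stops[1:]):
        arrows.append(
            Arrow(
                (x0, y0),
                (x1, y1),
                base_width + width_per_level * l0,
                base_width + width_per_level * l1,
            )
        )
    return arrows
```

An arrow that widens toward its end point then reads as a rise in level, and one that narrows as a drop.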
- the potential relationship graph is drawn based on a potential social network visualization algorithm.
- the algorithm uses the mined potential relationships between resumes to construct a visual representation of the resumes' social network; the generated potential relationship map expresses the originally abstract potential relationships between resumes intuitively in the form of a network graph.
- the specific steps of the algorithm are as follows:
- each resume uses a rounded rectangle as its visual metaphor.
- the "name" from the basic information is shown inside the rounded rectangle as the rectangle ID, and rectangles with different IDs represent different resumes.
- rounded rectangles are connected by line segments to represent a certain degree of similarity between the growth trajectories of the resumes.
- similar growth trajectories indicate that the growth experiences of the resumes are similar. For example, if persons A and B, represented by two resumes, took a similar amount of time to grow from "departmental cadre" to "office-level cadre", then the growth trajectories of A and B are similar.
- the length of the line segment represents the degree of similarity: the shorter the line segment (the smaller the distance between the two rectangles), the greater the similarity, and vice versa.
- the similarity between A and B is given by the similarity matrix sim mentioned in the group potential social relationship mining module.
- the intersection of history elements reflects the intersection relationships between the persons represented by the resumes, such as classmates, fellow villagers, and colleagues.
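The edge construction for the potential relationship graph can be sketched as below. The source only states that shorter line segments mean greater similarity; the linear mapping from similarity to length and the `min_length` and `max_length` parameters are assumptions.

```python
from typing import List, Tuple


def relationship_edges(
    names: List[str],
    sim: List[List[float]],
    s0: float,
    min_length: float = 20.0,
    max_length: float = 200.0,
) -> List[Tuple[str, str, float]]:
    """Return (name_i, name_j, target length) for every pair with sim > s0.

    The target length shrinks linearly as similarity grows (assumed mapping).
    """
    edges: List[Tuple[str, str, float]] = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if sim[i][j] > s0:
                length = min_length + (max_length - min_length) * (1.0 - sim[i][j])
                edges.append((names[i], names[j], length))
    return edges
```

The resulting lengths could be handed to any graph layout routine that accepts preferred edge lengths; which one to use is outside what the text specifies.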
- the organization chart is drawn based on the organization visualization algorithm.
- the algorithm extracts the unit intersection information between resumes from the history information, converts resumes that have a unit intersection into the organizational relationship of the corresponding unit, and visualizes that relationship as a tabular organization chart.
- the specific steps of the algorithm are as follows:
- the horizontal axis of the table header is the personnel axis, which represents the personnel of the unit;
- the vertical axis of the table is the grade axis, which represents the ranks of the unit; the grade axis is arranged in descending order from top to bottom, that is, the higher the rank, the higher its position.
- each table element is the avatar of the person represented by a resume.
- the row of an element represents the position held in the unit, and the column of an element represents the person represented by the resume.
- a table element has two states: ① active (the person's avatar is in color), indicating that the unit and position of the element are the person's current status (for example, the person's current position in the unit); ② inactive (the person's avatar is gray), indicating that the unit and position of the element are part of the person's history (for example, the person once held the corresponding position in the unit but no longer does).
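The tabular organization chart for a single unit can be sketched as a small data structure. The member tuple layout and the numeric `rank` used to order rows are assumptions; avatars are reduced here to an "active" or "inactive" state string.

```python
from typing import Dict, List, Tuple

Member = Tuple[str, str, int, bool]  # (person, position, rank, incumbent), assumed layout


def organization_table(
    members: List[Member],
) -> Tuple[List[str], List[str], Dict[Tuple[str, str], str]]:
    """Return (row labels, column labels, cells) for one unit.

    Rows are positions ordered from highest rank (top) to lowest;
    columns are persons in order of first appearance.
    """
    ranked = sorted({(rank, position) for _, position, rank, _ in members}, reverse=True)
    rows = [position for _, position in ranked]

    columns: List[str] = []
    for person, _, _, _ in members:
        if person not in columns:
            columns.append(person)

    cells = {
        (position, person): ("active" if incumbent else "inactive")
        for person, position, _, incumbent in members
    }
    return rows, columns, cells
```

Cells that are absent from the mapping correspond to empty table positions, where the person never held that position in the unit.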
- the module introduces human-computer interaction into a visual analysis environment for history data; building on the mining modules and the history information visualization module, it helps the user understand the latent information in a resume and the pattern features embodied in a large number of resumes, and thereby gain deeper insight.
- the module specifically includes the following steps:
- based on the trajectory time-space map, association analysis is provided from the perspective of human-computer interaction, so that the user can view the trajectory changes of a resume jointly from the two perspectives of time and space and thereby discover spatio-temporal trajectory patterns.
- predicting the future growth direction of a resume's trajectory from existing spatio-temporal trajectory patterns is also an important part of interactive visual analysis.
- as shown in FIG. 12, based on the trajectory time-space map, the user can find category patterns among the growth trajectories of different resumes by displaying multiple resumes side by side, and so quickly find the trajectory categories of interest.
- for example, from the visualization of official promotion shown in FIG. 12, the user can perceive three stages of personal growth: a growth period (early career, faster promotion), a bottleneck period (mid-career, promotion meets a bottleneck), and a breakthrough period (late career, the bottleneck is broken and promotion continues).
- the visual analysis environment is constrained as follows: the growth trajectories of at most 3 resumes can be compared and analyzed in the trajectory map at the same time, and the spatio-temporal trajectories of different resumes are slightly offset from one another along both the time axis and the spatial axis, which reduces occlusion between trajectories in the map without reducing visual precision.
- resume social network interactive visual analysis: as shown in FIG. 13, based on the group's potential relationship map, the user can select a target resume and the resumes that have a potential relationship with it, according to his or her own interests, to form a specific social network. On top of this social network, interactive editing and viewing functions guide the user to examine important potential relationships purposefully.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Human Resources & Organizations (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- General Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- Quality & Reliability (AREA)
- Operations Research (AREA)
- Marketing (AREA)
- Data Mining & Analysis (AREA)
- Development Economics (AREA)
- Educational Administration (AREA)
- Game Theory and Decision Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
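Start time | End time | Place | Unit | Position | Quantization level |
---|---|---|---|---|---|
1989 | 1992 | Ningxiang County, Hunan Province | Health Bureau Party Branch | Secretary | 0 |
1992 | 1995 | Ningxiang County, Hunan Province | County Party Committee | Secretary | 2 |
1995 | 1998 | Changsha, Hunan Province | Municipal Party Committee | Deputy Mayor | 3 |
1995 | 2002 | Changsha, Hunan Province | Municipal Party Committee | Secretary | 4 |
2002 | 2010 | Hunan Province | Provincial Party Committee | Deputy Governor | 5 |
2010 | 2014 | Beijing | Municipal Party Committee | Mayor | 6 |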
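The rows of this table correspond to the six-tuple named in the claims: start time, end time, place, unit, position, quantization level. A minimal sketch of that structure follows, with the ascending sort by start time that the claims describe. The `Segment` type and `growth_trajectory` function are hypothetical names, and any field values passed in are English renderings of table rows.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Segment(NamedTuple):
    start: int
    end: int
    place: str
    unit: str
    position: str
    level: int  # quantization level


def growth_trajectory(rows: Iterable[Tuple[int, int, str, str, str, int]]) -> List[Segment]:
    """Turn experience rows into six-tuples ordered by ascending start time."""
    return sorted((Segment(*row) for row in rows), key=lambda seg: seg.start)
```

Python's sort is stable, so rows that share a start time keep their original relative order.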
Claims (12)
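- An information visualization method based on text resume information, comprising the steps of: 1) performing experience-level quantization on the experience information in each piece of text resume information to obtain growth trajectory sequence data, and visualizing that data; 2) selecting the growth trajectory sequence data of multiple pieces of text resume information for association calculation to obtain the potential social relationships between the text resumes, and visualizing those potential social relationships as a social network; 3) based on the potential social relationships between resumes, constructing a visual representation of the organizational hierarchy of the units to which the persons belong, converting resumes that have a unit intersection into the organizational hierarchy relationship of the corresponding unit, and visualizing that organizational hierarchy relationship as an organization chart.
- The method of claim 1, wherein, if a resume is an unstructured text resume, it is first converted into structured text resume information by: 1) format-filtering the unstructured text resume to obtain pure resume text containing the resume information; 2) using natural language processing to perform word segmentation and named entity recognition on the pure resume text, then extracting the resume feature elements, to obtain structured text blocks containing the history elements; 3) converting the format of the structured text blocks containing the history elements to form the structured text resume information.
- The method of claim 2, wherein the structured text resume information comprises basic resume information and an experience information table; the basic resume information includes name, gender, ethnicity, and place of birth; the experience information table is a table structure whose header contains the fields start time, end time, place, unit, and position.
- The method of claim 3, wherein, for the unit feature element, a keyword matching algorithm is used to extract the resume feature element: first, a unit keyword dictionary is created, in which each row contains two parts, a keyword and auxiliary keywords, where the auxiliary keywords are of two types, R-type and L-type, and multiple auxiliary keywords are separated by commas; the unit keyword dictionary is then used to recognize unit elements: when a keyword in the dictionary is recognized, and there is no R-type auxiliary keyword to its right and no L-type auxiliary keyword to its left, recognition succeeds; otherwise, recognition fails; for the other resume feature elements, regular expression matching is used for extraction.
- The method of claim 3, wherein the growth trajectory sequence data is obtained by: 1) sorting the experience information table of each piece of text resume information in ascending order of the start time field to obtain an ordered experience information table; 2) scanning the records of the ordered experience information table one by one, extracting the place, unit, and position fields from each record, comparing each field value against an existing experience-level quantization library, and assigning the set quantization level to each matched entity; 3) forming an ordered sequence from the set of experience segments with their different levels, to obtain the growth trajectory sequence data.
- The method of claim 1 or 5, wherein the growth trajectory sequence data is a six-tuple, namely <start time, end time, place, unit, position, quantization level>.
- The method of any one of claims 1 to 5, wherein the potential social relationships are obtained by: 1) selecting the growth trajectory sequence data of n resumes and calculating the similarity sim(i,j) of the growth trajectory sequence data between any two resumes Mi and Mj, to obtain a similarity matrix sim; 2) scanning the matrix sim, and if sim(i,j)>s0, considering the growth trajectories of Mi and Mj to be similar, where s0 is a similarity threshold; 3) calculating the matching degree mch(i,j) between any two resumes Mi and Mj among the growth trajectory sequence data of the n resumes, and recording the details of their experience intersection in a history element intersection its(i,j); 4) determining from the matching degree mch(i,j) whether the growth experiences of Mi and Mj intersect, and if so, determining the potential relationship between Mi and Mj from the corresponding intersection its(i,j), and determining the closeness between Mi and Mj from sim(i,j).
- The method of claim 7, wherein the matching degree mch(i,j) between any two resumes Mi and Mj among the growth trajectory sequence data of the n resumes is calculated, and the details of their experience intersection are recorded in a history element intersection its(i,j), by: 1) setting two counters Ct and Cr with initial value 0, where Ct is the number of element comparisons between Mi and Mj and Cr is the number of times the same element occurs when the elements of Mi and Mj are compared; defining a difference element component list err(i,j), whose elements are the history elements that differ between Mi and Mj; defining a history element intersection list its(i,j), whose elements are the history elements that are the same between Mi and Mj; 2) scanning the basic information elements of Mi and Mj item by item, incrementing Ct by 1 for each element scanned; for any element f, if its value is the same in Mi and Mj, incrementing Cr by 1 and adding the element f to its(i,j); otherwise, adding the element f to err(i,j); 3) scanning the experience information tables of Mi and Mj row by row and, for each experience segment row, scanning the time, place, unit, and position fields it contains item by item, incrementing Ct by 1 for each field scanned; for any field e, if its value is the same in Mi and Mj, incrementing Cr by 1 and adding the element to its(i,j); otherwise, adding the element to err(i,j); 4) calculating the matching degree mch(i,j) of Mi and Mj by the formula mch(i,j)=Cr/Ct.
- The method of any one of claims 1 to 5, wherein the organizational hierarchy of the units to which the persons belong is constructed, based on the potential social relationships between resumes, by an organization generation method comprising: 1) recording the potential social relationships as a matrix R, whose element Rij represents the potential social relationship between resume Mi and resume Mj; 2) establishing an organization library V for storing all organizations and their member information, where each element of the library is a tree structure whose root node is the organization name and whose leaf nodes are member information, with the specific structure <organization name, <member 1, position 1, incumbent or not>, <member 2, position 2, incumbent or not>, ..., <member m, position m, incumbent or not>>; 3) traversing the matrix R, and if resumes Mi and Mj represented by Rij have a unit intersection, saving that unit together with resume Mi and resume Mj to the organization library V; 4) visualizing all elements of V, according to the tree structure, using the organization visualization method.
- The method of claim 1 or 2, wherein a type analysis in the time dimension and the spatial dimension is performed on each growth trajectory sequence data to obtain the spatio-temporal growth pattern of the corresponding text resume; the spatio-temporal growth pattern is obtained by: first defining the growth types of a resume as it changes over time and the growth types of a resume as it migrates through space, and determining the features of each growth type, where the features of the growth types over time include a level time-span feature and/or a time-series growth slope feature, and the features of the growth types through space are determined from the geographic location of the units in the resume; selecting part of the growth trajectory sequence data as sample data and marking its growth type according to the determined growth type features; training a machine learning classifier on the sample data to obtain the classifier model parameters, and then classifying and marking the unmarked growth trajectory sequence data.
- An intelligent visual analysis system based on text resume information, comprising a personal growth experience quantization module, a group potential social relationship mining module, an organization generation module, and a resume information visualization module, wherein: the personal growth experience quantization module performs experience-level quantization on the experience information in the history elements to obtain growth trajectory sequence data; the group potential social relationship mining module performs association calculation on the growth trajectory sequence data of multiple resumes to obtain the potential social relationships between resumes; the organization generation module, based on the potential social relationships of the group represented by multiple resumes, extracts and restores the hierarchy information of organizations from the group's unit intersection information; the resume information visualization module converts the growth trajectory sequence data of resumes and the results output by the group potential social relationship mining module and the organization generation module into information visualization graphs.
- The system of claim 11, further comprising a text resume preprocessing module and a personal growth pattern mining module, wherein: the text resume preprocessing module preprocesses unstructured text resume data and extracts the elements in the resume information to obtain structured history element XML data; the personal growth pattern mining module performs a type analysis in the time dimension and the spatial dimension on the growth trajectory sequence data to obtain the spatio-temporal growth pattern of the resume.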
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/898,897 US20170200125A1 (en) | 2014-09-25 | 2014-10-15 | Information visualization method and intelligent visual analysis system based on text curriculum vitae information |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410496047.1A CN104318340B (zh) | 2014-09-25 | 2014-09-25 | Information visualization method and intelligent visual analysis system based on text curriculum vitae information |
CN201410496047.1 | 2014-09-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016045153A1 (zh) | 2016-03-31 |
Family
ID=52373568
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2014/088601 WO2016045153A1 (zh) | 2014-09-25 | 2014-10-15 | Information visualization method and intelligent visual analysis system based on text curriculum vitae information |
Country Status (3)
Country | Link |
---|---|
US (1) | US20170200125A1 (zh) |
CN (1) | CN104318340B (zh) |
WO (1) | WO2016045153A1 (zh) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109344302A (zh) * | 2018-08-14 | 2019-02-15 | 中国平安人寿保险股份有限公司 | 一种组织架构信息的展示方法、存储介质和服务器 |
CN109635301A (zh) * | 2018-12-14 | 2019-04-16 | 湖南惟楚有才教育科技有限公司 | 一种教育资源管理方法及系统 |
CN109657039A (zh) * | 2018-11-15 | 2019-04-19 | 中山大学 | 一种基于双层BiLSTM-CRF的工作履历信息抽取方法 |
CN109766438A (zh) * | 2018-12-12 | 2019-05-17 | 平安科技(深圳)有限公司 | 简历信息提取方法、装置、计算机设备和存储介质 |
CN110781658A (zh) * | 2019-10-14 | 2020-02-11 | 北京字节跳动网络技术有限公司 | 简历解析方法、装置、电子设备和存储介质 |
CN111984784A (zh) * | 2020-07-17 | 2020-11-24 | 北京嘀嘀无限科技发展有限公司 | 人岗匹配方法、装置、电子设备和存储介质 |
CN112100237A (zh) * | 2020-09-04 | 2020-12-18 | 北京百度网讯科技有限公司 | 一种用户数据处理方法、装置、设备以及存储介质 |
CN113095075A (zh) * | 2021-04-02 | 2021-07-09 | 上海中通吉网络技术有限公司 | 一种简历文件解析方法 |
CN114708946A (zh) * | 2022-03-22 | 2022-07-05 | 北京蓝田医疗设备有限公司 | 一种目标导向性专项能力训练方法及装置 |
Families Citing this family (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104951545B (zh) * | 2015-06-23 | 2018-07-10 | 百度在线网络技术(北京)有限公司 | 输出对象的数据处理方法及装置 |
CN105260413A (zh) * | 2015-09-24 | 2016-01-20 | 广东小天才科技有限公司 | 信息处理方法及装置 |
CN105786999A (zh) * | 2016-02-17 | 2016-07-20 | 扬州大学 | 一种基于复杂网络关系的软件开发人员可视化推荐方法 |
US10692099B2 (en) * | 2016-04-11 | 2020-06-23 | International Business Machines Corporation | Feature learning on customer journey using categorical sequence data |
CN106844493A (zh) * | 2016-12-26 | 2017-06-13 | 中国科学院自动化研究所 | 面向本体的时空信息挖掘及可视化展示方法 |
CN106874456B (zh) * | 2017-02-14 | 2020-06-23 | 广州优视网络科技有限公司 | 人群优先级计算方法、装置及计算设备 |
US10833964B2 (en) * | 2017-03-13 | 2020-11-10 | Shenzhen Institutes Of Advanced Technology Chinese Academy Of Sciences | Visual analytical method and system for network system structure and network communication mode |
US11238363B2 (en) * | 2017-04-27 | 2022-02-01 | Accenture Global Solutions Limited | Entity classification based on machine learning techniques |
CN107392143B (zh) * | 2017-07-20 | 2019-12-27 | 中国科学院软件研究所 | 一种基于svm文本分类的简历精确解析方法 |
US10884980B2 (en) * | 2017-07-26 | 2021-01-05 | International Business Machines Corporation | Cognitive file and object management for distributed storage environments |
US10817515B2 (en) | 2017-07-26 | 2020-10-27 | International Business Machines Corporation | Cognitive data filtering for storage environments |
CN107679194B (zh) * | 2017-10-09 | 2020-04-10 | 东软集团股份有限公司 | 一种基于文本的实体关系构建方法、装置及设备 |
CN107656909B (zh) * | 2017-10-30 | 2021-06-01 | 北京明朝万达科技股份有限公司 | 一种基于文档混合特征的文档相似度判定方法和装置 |
CN107944915B (zh) * | 2017-11-21 | 2022-01-18 | 北京字节跳动网络技术有限公司 | 一种游戏用户行为分析方法及计算机可读存储介质 |
CN108319733B (zh) * | 2018-03-29 | 2020-08-25 | 华中师范大学 | 一种基于地图的教育大数据分析方法及系统 |
US11113324B2 (en) * | 2018-07-26 | 2021-09-07 | JANZZ Ltd | Classifier system and method |
CN109446235B (zh) * | 2018-10-18 | 2020-10-02 | 哈尔滨工业大学(深圳) | 多维高效用序列模式处理方法、装置和计算机设备 |
CN109754224A (zh) * | 2018-12-29 | 2019-05-14 | 贵州小爱机器人科技有限公司 | 人事关系图谱构建方法、装置以及计算机存储介质 |
CN109948447B (zh) * | 2019-02-21 | 2023-08-25 | 山东科技大学 | 基于视频图像识别的人物网络关系发现及演化呈现方法 |
CN110147360B (zh) * | 2019-04-03 | 2021-07-30 | 深圳价值在线信息科技股份有限公司 | 一种数据整合方法、装置、存储介质和服务器 |
CN110427406A (zh) * | 2019-08-10 | 2019-11-08 | 吴诚诚 | 组织机构相关人员关系的挖掘方法及装置 |
CN110610001B (zh) * | 2019-08-12 | 2024-01-23 | 大箴(杭州)科技有限公司 | 短文本完整性识别方法、装置、存储介质及计算机设备 |
CN111126951B (zh) * | 2019-12-11 | 2022-12-20 | 云南电网有限责任公司 | 一种基于数字化的企业干部人才决策方法 |
CN111177583A (zh) * | 2019-12-30 | 2020-05-19 | 山东合天智汇信息技术有限公司 | 一种基于社交平台的人脉分析方法及系统 |
US11829386B2 (en) | 2020-01-30 | 2023-11-28 | HG Insights, Inc. | Identifying anonymized resume corpus data pertaining to the same individual |
CN111782970B (zh) * | 2020-07-23 | 2024-03-22 | 广州汇智通信技术有限公司 | 一种数据分析方法和装置 |
CN112364626B (zh) * | 2020-11-25 | 2023-09-01 | 广东电网有限责任公司佛山供电局 | 一种安全措施智能管理方法及系统 |
CN113517074B (zh) * | 2020-12-10 | 2023-09-12 | 中国人民解放军战略支援部队信息工程大学 | 一种流行病患者信息三维空间可视化方法 |
CN113449524B (zh) * | 2021-04-01 | 2023-04-07 | 山东英信计算机技术有限公司 | 一种命名实体识别方法、系统、设备以及介质 |
CN113486003B (zh) * | 2021-06-02 | 2024-03-19 | 广州数说故事信息科技有限公司 | 数据可视化时考虑异常值的企业数据集处理方法及系统 |
CN113673943B (zh) * | 2021-07-19 | 2023-02-10 | 清华大学深圳国际研究生院 | 一种基于履历大数据的人员任免辅助决策方法及系统 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101546331A (zh) * | 2009-05-07 | 2009-09-30 | 刘健 | 获取有助检索的特征、评价相关事物的价值的系统及方法 |
US7685151B2 (en) * | 2006-04-12 | 2010-03-23 | International Business Machines Corporation | Coordinated employee records with version history and transition ownership |
CN104036360A (zh) * | 2014-06-19 | 2014-09-10 | 中国科学院软件研究所 | 一种基于磁卡考勤行为的用户数据处理系统及处理方法 |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1167026C (zh) * | 2001-01-22 | 2004-09-15 | 前程无忧网络信息技术(北京)有限公司上海分公司 | 汉语个人简历信息处理系统和方法 |
CN102999523A (zh) * | 2011-09-16 | 2013-03-27 | 陆敏 | 一种才智数字化的方法 |
CN102999794A (zh) * | 2011-09-16 | 2013-03-27 | 陆敏 | 人力资源人工智能的方法 |
-
2014
- 2014-09-25 CN CN201410496047.1A patent/CN104318340B/zh active Active
- 2014-10-15 US US14/898,897 patent/US20170200125A1/en not_active Abandoned
- 2014-10-15 WO PCT/CN2014/088601 patent/WO2016045153A1/zh active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7685151B2 (en) * | 2006-04-12 | 2010-03-23 | International Business Machines Corporation | Coordinated employee records with version history and transition ownership |
CN101546331A (zh) * | 2009-05-07 | 2009-09-30 | 刘健 | 获取有助检索的特征、评价相关事物的价值的系统及方法 |
CN104036360A (zh) * | 2014-06-19 | 2014-09-10 | 中国科学院软件研究所 | 一种基于磁卡考勤行为的用户数据处理系统及处理方法 |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109344302A (zh) * | 2018-08-14 | 2019-02-15 | 中国平安人寿保险股份有限公司 | 一种组织架构信息的展示方法、存储介质和服务器 |
CN109344302B (zh) * | 2018-08-14 | 2023-11-28 | 中国平安人寿保险股份有限公司 | 一种组织架构信息的展示方法、存储介质和服务器 |
CN109657039B (zh) * | 2018-11-15 | 2023-04-07 | 中山大学 | 一种基于双层BiLSTM-CRF的工作履历信息抽取方法 |
CN109657039A (zh) * | 2018-11-15 | 2019-04-19 | 中山大学 | 一种基于双层BiLSTM-CRF的工作履历信息抽取方法 |
CN109766438A (zh) * | 2018-12-12 | 2019-05-17 | 平安科技(深圳)有限公司 | 简历信息提取方法、装置、计算机设备和存储介质 |
CN109635301A (zh) * | 2018-12-14 | 2019-04-16 | 湖南惟楚有才教育科技有限公司 | 一种教育资源管理方法及系统 |
CN110781658A (zh) * | 2019-10-14 | 2020-02-11 | 北京字节跳动网络技术有限公司 | 简历解析方法、装置、电子设备和存储介质 |
CN110781658B (zh) * | 2019-10-14 | 2023-08-25 | 抖音视界有限公司 | 简历解析方法、装置、电子设备和存储介质 |
CN111984784A (zh) * | 2020-07-17 | 2020-11-24 | 北京嘀嘀无限科技发展有限公司 | 人岗匹配方法、装置、电子设备和存储介质 |
CN111984784B (zh) * | 2020-07-17 | 2024-03-12 | 北京嘀嘀无限科技发展有限公司 | 人岗匹配方法、装置、电子设备和存储介质 |
CN112100237A (zh) * | 2020-09-04 | 2020-12-18 | 北京百度网讯科技有限公司 | 一种用户数据处理方法、装置、设备以及存储介质 |
CN112100237B (zh) * | 2020-09-04 | 2023-08-15 | 北京百度网讯科技有限公司 | 一种用户数据处理方法、装置、设备以及存储介质 |
CN113095075A (zh) * | 2021-04-02 | 2021-07-09 | 上海中通吉网络技术有限公司 | 一种简历文件解析方法 |
CN114708946B (zh) * | 2022-03-22 | 2022-10-11 | 北京蓝田医疗设备有限公司 | 一种目标导向性专项能力训练方法及装置 |
CN114708946A (zh) * | 2022-03-22 | 2022-07-05 | 北京蓝田医疗设备有限公司 | 一种目标导向性专项能力训练方法及装置 |
Also Published As
Publication number | Publication date |
---|---|
CN104318340B (zh) | 2017-07-07 |
US20170200125A1 (en) | 2017-07-13 |
CN104318340A (zh) | 2015-01-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2016045153A1 (zh) | Information visualization method and intelligent visual analysis system based on text curriculum vitae information | |
Meng et al. | What makes an online review more helpful: an interpretation framework using XGBoost and SHAP values | |
US11899674B2 (en) | Systems and methods to determine and utilize conceptual relatedness between natural language sources | |
Khan et al. | A survey on scholarly data: From big data perspective | |
Shi et al. | Prospecting information extraction by text mining based on convolutional neural networks–a case study of the Lala copper deposit, China | |
Tanwar et al. | Unravelling unstructured data: A wealth of information in big data | |
CN104850601B (zh) | 基于图数据库的警务实时分析应用平台及其构建方法 | |
Baglatzi et al. | Semantifying OpenStreetMap. | |
Zhang et al. | Data mining applications in university information management system development | |
Theocharis et al. | Knowledge management systems in the public sector: Critical issues | |
Ait-Mlouk et al. | Winfra: A web-based platform for semantic data retrieval and data analytics | |
Chen et al. | Data analysis and knowledge discovery in web recruitment—based on big data related jobs | |
Li et al. | Construction of sentimental knowledge graph of Chinese government policy comments | |
CN112632223A (zh) | 案事件知识图谱构建方法及相关设备 | |
Miller et al. | Digging into human rights violations: Data modelling and collective memory | |
Wang et al. | Eliciting big data requirement from big data itself: A task-directed approach | |
Xu et al. | Research on Tibetan hot words, sensitive words tracking and public opinion classification | |
Fuller et al. | Structuring, recording, and analyzing historical networks in the china biographical database | |
Yu et al. | Data service generation framework from heterogeneous printed forms using semantic link discovery | |
Chuprina et al. | A way how to impart data science skills to computer science students exemplified by obda-systems development | |
Zhomartkyzy et al. | The development of information models and methods of university scientific knowledge management | |
Gurcan et al. | Big data research landscape: A meta-analysis and literature review from 2009 to 2018 | |
Luo | [Retracted] Analysis the Innovation Path on Psychological Ideological with Political Teaching in Universities by Big Data in New Era | |
Xiao | Educational Information Recommendation System for College Design Based on Apriori Algorithm | |
Lytras et al. | Innovations, developments, and applications of semantic web and information systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 14898897 Country of ref document: US |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14902523 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 04.08.2017) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 14902523 Country of ref document: EP Kind code of ref document: A1 |