CN111666377A - Talent portrait construction method and system based on big data modeling - Google Patents

Talent portrait construction method and system based on big data modeling

Info

Publication number
CN111666377A
CN111666377A (application CN202010493764.4A)
Authority
CN
China
Prior art keywords
data
talent
sample
grade
honor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010493764.4A
Other languages
Chinese (zh)
Inventor
杨灵运
杨文峰
张昌福
邓生雄
张磊
李琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Casicloud Technology Co ltd
Original Assignee
Guizhou Casicloud Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Casicloud Technology Co ltd filed Critical Guizhou Casicloud Technology Co ltd
Priority to CN202010493764.4A
Publication of CN111666377A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G06F16/334 Query execution
    • G06F16/3343 Query execution using phonetics

Abstract

The invention discloses a talent portrait construction method and system based on big data modeling. The method comprises: obtaining sample talent data, performing talent weight refinement on the sample talent data to obtain a talent data set, and constructing a sample talent data model from the talent data set; acquiring voice data of the interviewer and performing textualization processing on the voice data to obtain text data; screening the text data to obtain a screening data set, and matching the screening data set with the talent data set to obtain an image set; and constructing the talent portrait of the image set by using the sample talent data model. The invention overcomes the drawback of existing schemes, in which personnel information is obtained from resumes whose authenticity cannot be verified, so the authenticity of the acquired information is limited, and it can effectively improve the accuracy of talent portraits.

Description

Talent portrait construction method and system based on big data modeling
Technical Field
The invention belongs to the field of big data, relates to a talent portrait technology, and particularly relates to a talent portrait construction method and system based on big data modeling.
Background
A user portrait is a tagged user model abstracted from information such as a user's social attributes, living habits and consumption behaviors. The core work of constructing a user portrait is to tag users in real time, and these tags are highly refined feature identifiers obtained by analyzing the user's information.
The significance of talent portraits is widely recognized and agreed upon. In practice, however, different industries and enterprises differ in cultural background, strategic requirements, concerns and methodologies, so talent portrait projects are applied in different ways under different organizational cultures. The accumulation of data in the field of human resources also differs, and differences in organizational management modes lead to individualized differences in how talent portraits are understood.
A correct and efficient talent portrait construction method and system can help an enterprise accurately identify effective talents and save manpower and material resources. However, because talent portraits are influenced by numerous factors, current talent portrait methods are not accurate or efficient enough, so the personnel obtained through them may not meet the enterprise's requirements and may cause inestimable losses. The present scheme is provided to address the above drawbacks.
Disclosure of Invention
The invention aims to provide a talent portrait construction method and system based on big data modeling.
The technical problem to be solved by the invention is as follows:
(1) How to model with big data: sample talent data is acquired, talent weight refinement is performed on the sample talent data to obtain a talent data set, and a sample talent data model is constructed from the talent data set; the sample talent data model obtained by processing sample data in advance provides strong data support for the subsequent construction of talent portraits and can effectively improve the accuracy and efficiency of talent portraits;
(2) How to construct talent portraits from the acquired data: voice data of the interviewee is acquired and converted into text data, the text data is screened to obtain a screening data set, the screening data set is matched with the talent data set to obtain an image set, and the talent portrait of the image set is constructed by using the sample talent data model; this overcomes the drawback of existing schemes, in which personnel information is obtained from resumes whose authenticity cannot be verified, so the authenticity of the acquired information is limited, and it can effectively improve the accuracy of talent portraits.
The purpose of the invention can be realized by the following technical scheme:
a talent portrait construction method based on big data modeling comprises the following steps:
s1: acquiring sample talent data, performing talent weight refinement on the sample talent data to obtain a talent data set, and constructing a sample talent data model by using the talent data set;
s2: acquiring voice data of interviewers, and performing textualization processing on the voice data to obtain text data;
s3: screening the text data to obtain a screening data set, and matching the screening data set with the talent data set to obtain an image set;
s4: and constructing the talent portrait of the portrait set by utilizing the sample talent data model.
Further, the talent weight refining of the sample talent data to obtain a talent data set includes:
s21: acquiring talent weight phrases in the sample talent data, and performing phrase division on the sample talent data to obtain a divided data set, wherein the divided data set comprises a sample college information set, a sample top-100 enterprise set and a sample honor set;
s22: performing grade division on the sample college information set to obtain a college grade set, performing score marking on the college grade set by using a preset score algorithm to obtain a college score set, and combining the college grade set and the college score set to obtain sample college data, the colleges in the college grade set being marked as Gi, i = 1, 2, 3, …, n;
s23: performing grade division on the sample top-100 enterprise set to obtain an enterprise grade set, performing score marking on the sample top-100 enterprise set by using the preset score algorithm to obtain an enterprise score set, and combining the enterprise grade set and the enterprise score set to obtain sample enterprise data, the enterprises in the enterprise grade set being marked as Qi, i = 1, 2, 3, …, n;
s24: performing grade division on the sample honor set to obtain an honor grade set, performing score marking on the sample honor set by using the preset score algorithm to obtain an honor score set, and combining the honor grade set and the honor score set to obtain sample honor data, the honor items in the honor grade set being marked as Ri, i = 1, 2, 3, …, n;
s25: combining the sample college data, the sample enterprise data and the sample honor data to obtain the talent data set.
Further, performing textualization processing on the voice data to obtain text data includes:
s31, sampling the voice data by using a preset sampling rate and sampling bit depth to obtain a sampling data set;
s32, quantizing the sampling data set to obtain a quantized data set;
s33, pre-emphasis is carried out on the quantized data set to obtain a first feature set;
s34, performing framing and windowing on the first feature set to obtain a second feature set;
s35, performing discrete Fourier transform processing on the second feature set to obtain a third feature set;
and S36, performing textualization processing on the third feature set by using a text feature coefficient algorithm to obtain text data.
Further, matching the screening data set with the talent data set to obtain an image set comprises:
s41: carrying out phrase division on the text data to obtain a phrase data set, and marking phrases in the phrase data set as Wi, i = 1, 2, 3, …, n;
s42: screening the phrase data set according to preset keywords to obtain a screened data set, wherein the screened data set comprises education data, working data and prize winning data, and marking screened phrases in the screened data set as Wij, i = 1, 2, 3, …, n, j = 1, 2, 3, …, n;
s43: comparing the screening data set with the talent data set, and storing phrases with the same comparison result to obtain matching data;
s44: and combining the matched data to obtain an image set.
Further, constructing the talent portrait of the portrait set by using the sample talent data model includes:
s51, matching the image set with the sample talent data model to obtain education data scores, work data scores and prize winning data scores corresponding to the education data, the work data and the prize winning data in the image set;
s52, matching the education data score, the work data score and the prize winning data score with a preset score grade to obtain the education grade data, the work grade data and the prize winning grade data;
and S53, dividing and combining the education grade data, the work grade data and the prize winning grade data according to a preset proportion to obtain the talent portrait of the interviewee.
A talent portrait construction system based on big data modeling comprises a sample data processing module, a voice data processing module, a text data processing module and a portrait construction module;
the sample data processing module is used for acquiring sample talent data, performing talent weight refinement on the sample talent data to obtain a talent data set, and constructing a sample talent data model by using the talent data set;
the voice data processing module is used for acquiring voice data of interviewers and performing textualization processing on the voice data to obtain text data;
the text data processing module is used for screening the text data to obtain a screening data set, and matching the screening data set with the talent data set to obtain an image set;
the portrait construction module is used for constructing the talent portrait of the portrait set by utilizing the sample talent data model.
The invention has the beneficial effects that:
(1) In one aspect of the invention, sample talent data is acquired, talent weight refinement is performed on the sample talent data to obtain a talent data set, and a sample talent data model is constructed from the talent data set; the sample talent data model obtained by processing the sample data in advance provides strong data support for the subsequent construction of talent portraits and can effectively improve the accuracy and efficiency of talent portraits;
(2) In another aspect, the voice data of the interviewer is acquired and converted into text data, the text data is screened to obtain a screening data set, the screening data set is matched with the talent data set to obtain an image set, and the talent portrait of the image set is constructed by using the sample talent data model; this overcomes the drawback of existing schemes, in which personnel information is obtained from resumes whose authenticity cannot be verified, so the authenticity of the acquired information is limited, and it can effectively improve the accuracy of talent portraits.
Drawings
In order to facilitate understanding for those skilled in the art, the present invention will be further described with reference to the accompanying drawings.
FIG. 1 is a flow chart of a talent portrait construction method based on big data modeling according to the present invention.
Detailed Description
As shown in FIG. 1, a talent portrait construction method based on big data modeling includes:
s1: acquiring sample talent data, performing talent weight refinement on the sample talent data to obtain a talent data set, and constructing a sample talent data model by using the talent data set;
s2: acquiring voice data of interviewers, and performing textualization processing on the voice data to obtain text data;
s3: screening the text data to obtain a screening data set, and matching the screening data set with the talent data set to obtain an image set;
s4: and constructing the talent portrait of the portrait set by utilizing the sample talent data model.
Performing talent weight refinement on the sample talent data to obtain the talent data set includes:
s21: acquiring talent weight phrases in the sample talent data, and performing phrase division on the sample talent data to obtain a divided data set, wherein the divided data set comprises a sample college information set, a sample top-100 enterprise set and a sample honor set;
s22: performing grade division on the sample college information set to obtain a college grade set, performing score marking on the college grade set by using a preset score algorithm to obtain a college score set, and combining the college grade set and the college score set to obtain sample college data, the colleges in the college grade set being marked as Gi, i = 1, 2, 3, …, n;
s23: performing grade division on the sample top-100 enterprise set to obtain an enterprise grade set, performing score marking on the sample top-100 enterprise set by using the preset score algorithm to obtain an enterprise score set, and combining the enterprise grade set and the enterprise score set to obtain sample enterprise data, the enterprises in the enterprise grade set being marked as Qi, i = 1, 2, 3, …, n;
s24: performing grade division on the sample honor set to obtain an honor grade set, performing score marking on the sample honor set by using the preset score algorithm to obtain an honor score set, and combining the honor grade set and the honor score set to obtain sample honor data, the honor items in the honor grade set being marked as Ri, i = 1, 2, 3, …, n;
s25: combining the sample college data, the sample enterprise data and the sample honor data to obtain the talent data set.
In the embodiment of the invention, the sample college information set may be a ranking list of the top five hundred colleges, the sample top-100 enterprise set may be a ranking list of the top five hundred enterprises, and the sample honor set may be national-level, provincial-level and city-level honor certificates. When the sample college information set, the sample top-100 enterprise set and the sample honor set are graded, they are divided into grades in blocks of 100 ranks: for example, the interval [1,100) is the first grade, [100,200) is the second grade, [200,300) is the third grade, [300,400) is the fourth grade and [400,500] is the fifth grade, and the grades decrease from the first grade to the fifth grade;
scores are then assigned to each college in the sample college information set, each enterprise in the sample top-100 enterprise set and each honor in the sample honor set by using the preset score algorithm. The colleges are scored in decreasing order from front to back, for example the college ranked first gets 500 points and the college ranked second gets 499 points; the enterprises are likewise scored in decreasing order, for example the enterprise ranked first gets 500 points and the enterprise ranked second gets 499 points; and the honors are scored in decreasing order, for example the honor ranked first gets 500 points, the honor ranked second gets 499 points, and so on;
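For illustration only, the following Python sketch shows one way the grading and scoring just described could be implemented. All names are hypothetical and not part of the patent; the block size of 100 ranks and the 500-point starting score follow the example values above, and the grade boundaries are approximated with simple integer division.

```python
def grade_and_score(ranked_names, block_size=100, top_score=500):
    """Assign each ranked entry a grade per block of `block_size` ranks
    (grade 1 is best) and a score decreasing by one point per rank."""
    result = []
    for rank, name in enumerate(ranked_names, start=1):
        grade = (rank - 1) // block_size + 1  # approximates the [1,100), [100,200), ... intervals above
        score = top_score - (rank - 1)        # rank 1 -> 500 points, rank 2 -> 499 points, ...
        result.append({"name": name, "rank": rank, "grade": grade, "score": score})
    return result

# Illustrative stand-ins for the ranked college, enterprise and honor lists.
talent_data_set = {
    "college":    grade_and_score(["College A", "College B"]),
    "enterprise": grade_and_score(["Enterprise A", "Enterprise B"]),
    "honor":      grade_and_score(["National Award A", "Provincial Award B"]),
}
```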
performing the text processing on the voice data to obtain text data includes:
s31, sampling the voice data by using a preset sampling rate and sampling digit to obtain a sampling data set;
s32, quantizing the sampling data set to obtain a quantized data set;
s33, pre-emphasis is carried out on the quantized data set to obtain a first feature set;
s34, performing framing and windowing on the first feature set to obtain a second feature set;
s35, performing discrete Fourier transform processing on the second feature set to obtain a third feature set;
and S36, performing textualization processing on the third feature set by using a text feature coefficient algorithm to obtain text data.
In the embodiment of the invention, the preset sampling rate and sampling bit depth are 16 kHz and 16 bits, respectively. Sampling converts the continuous sound waveform in the voice data into discrete data points, the data points are stored as amplitude values, and the amplitude values are quantized into integers; quantization here is used in the digital-signal-processing sense;
the pre-emphasis is used to boost the energy of the high-frequency part of the sound; strengthening the high-frequency energy makes the high-frequency formants better usable and thereby improves recognition accuracy. Pre-emphasis can be realized with a first-order high-pass filter: in the time domain, if the input signal is x[n], where n indexes the samples, the filter is expressed as y[n] = x[n] - μ·x[n-1], where the preset coefficient μ takes a value between 0.9 and 1.0, usually 0.97;
framing means grouping N sampling points into one observation unit. In general N is 256 or 512, covering roughly 20-30 ms. To avoid excessive change between two adjacent frames, an overlap region of M sampling points is placed between them, where M is usually 1/2 or 1/3 of N. The sampling frequency adopted for speech recognition is 8 kHz or 16 kHz; at 8 kHz, a 256-sample frame corresponds to a duration of 256/8000 × 1000 = 32 ms;
windowing addresses the fact that everyday sound is generally a non-stationary signal whose statistical characteristics are not constant, although the signal can be considered stationary over a short period of time. A window is described by three parameters: window length, offset and shape. Each windowed sound segment is called a frame, the number of milliseconds in each frame is called the frame length, and the distance between the left boundaries of two adjacent frames is called the frame shift;
the discrete Fourier transform transforms the signal from the time domain to the frequency domain so that its spectral structure and variation can be studied. A fast Fourier transform is applied to each framed and windowed frame of the voice data to obtain the spectrum of each frame, and the squared magnitude of the spectrum gives the power spectrum of the voice signal;
the text characteristic coefficient algorithm is a Mel-frequency cepstral coefficient (MFCC) algorithm.
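The acoustic front end described above can be sketched in a few lines of NumPy. This is only an illustration under the example parameters in the text (8 kHz sampling, 256-sample frames, half-frame overlap, μ = 0.97, Hamming window); the final MFCC step is indicated only by a comment, and all names are assumptions, not the patent's implementation.

```python
import numpy as np

def speech_frontend(signal, mu=0.97, frame_len=256, hop=128):
    """Pre-emphasis, framing, windowing and power spectrum for one utterance."""
    signal = np.asarray(signal, dtype=float)

    # Pre-emphasis: y[n] = x[n] - mu * x[n-1]
    emphasized = np.append(signal[0], signal[1:] - mu * signal[:-1])

    # Framing: frame_len samples per frame, adjacent frames overlap by frame_len - hop samples
    n_frames = 1 + max(0, (len(emphasized) - frame_len) // hop)
    frames = np.stack([emphasized[i * hop:i * hop + frame_len] for i in range(n_frames)])

    # Windowing with a Hamming window to smooth the frame boundaries
    frames *= np.hamming(frame_len)

    # Discrete Fourier transform of each frame, then squared magnitude -> power spectrum
    spectrum = np.fft.rfft(frames, n=frame_len)
    power = np.abs(spectrum) ** 2 / frame_len
    return power  # a mel filter bank, log and DCT (the MFCC step) would follow here

# One second of audio at 8 kHz: each 256-sample frame spans 256 / 8000 * 1000 = 32 ms.
power_spectrum = speech_frontend(np.zeros(8000))
```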
Matching the screening data set with the talent data set to obtain an image set includes:
s41: carrying out phrase division on the text data to obtain a phrase data set, and marking phrases in the phrase data set as Wi, i = 1, 2, 3, …, n;
s42: screening the phrase data set according to preset keywords to obtain a screened data set, wherein the screened data set comprises education data, working data and prize winning data, and marking screened phrases in the screened data set as Wij, i = 1, 2, 3, …, n, j = 1, 2, 3, …, n;
s43: comparing the screening data set with the talent data set, and storing phrases with the same comparison result to obtain matching data;
s44: and combining the matched data to obtain an image set.
In the embodiment of the invention, named entities in the text data are extracted, including but not limited to the interviewee's schools, companies, award names, activities participated in, and the like, to obtain a phrase data set, and the phrase data set is screened and matched according to preset college keywords, enterprise keywords and honor keywords, so as to obtain the image set of the interviewee.
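A minimal sketch of this screening and matching step might look as follows. The keyword lists are hypothetical placeholders for the preset college, enterprise and honor keywords, and the `talent_data_set` structure is the hypothetical one used in the earlier sketch, not the patent's actual data.

```python
# Hypothetical preset keyword lists (stand-ins for the college, enterprise and honor keywords).
PRESET_KEYWORDS = {
    "education": ["university", "college", "institute"],
    "work":      ["company", "enterprise", "group"],
    "award":     ["award", "prize", "honor"],
}

def screen_and_match(phrases, talent_data_set):
    """Screen phrases by preset keywords, then keep only those that also appear
    in the corresponding part of the talent data set (the matching data)."""
    known = {
        "education": {entry["name"] for entry in talent_data_set["college"]},
        "work":      {entry["name"] for entry in talent_data_set["enterprise"]},
        "award":     {entry["name"] for entry in talent_data_set["honor"]},
    }
    image_set = {"education": [], "work": [], "award": []}
    for phrase in phrases:
        for category, keywords in PRESET_KEYWORDS.items():
            if any(k in phrase.lower() for k in keywords) and phrase in known[category]:
                image_set[category].append(phrase)  # phrase found in both sets -> keep it
    return image_set

# Example call with hypothetical interview phrases:
# image_set = screen_and_match(["College A", "Enterprise A"], talent_data_set)
```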
Constructing the talent portrait of the interviewer by using the portrait set includes the following steps:
s51, matching the image set with the sample talent data model to obtain education data scores, work data scores and prize winning data scores corresponding to the education data, the work data and the prize winning data in the image set;
s52, matching the education data score, the work data score and the prize winning data score with a preset score grade to obtain the education grade data, the work grade data and the prize winning grade data;
and S53, dividing and combining the education grade data, the work grade data and the prize winning grade data according to a preset proportion to obtain the talent portrait of the interviewee.
In the embodiment of the invention, the college keywords, enterprise keywords and honor keywords in the portrait set are matched against the sample talent data model to obtain the score and grade of the school the interviewee studied at, the score and grade of the enterprise the interviewee worked at, and the score and grade of the honors won. The talent score of the interviewee is obtained from the school score, the enterprise score and the honor score, and the talent grade of the interviewee is obtained from the school grade, the enterprise grade and the honor grade; the talent portrait of the interviewee is obtained from the talent score and the talent grade.
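The score and grade lookup and the weighted combination can be sketched as follows. The 0.4/0.4/0.2 proportions are purely illustrative stand-ins for the "preset proportion" mentioned in the text, and the data structures are the hypothetical ones from the earlier sketches.

```python
PRESET_PROPORTION = {"education": 0.4, "work": 0.4, "award": 0.2}  # illustrative values only

def build_talent_portrait(image_set, talent_data_set, weights=PRESET_PROPORTION):
    """Look up matched entries in the sample talent data model, take the best
    score and grade per category, and combine them with the preset proportion."""
    lookup = {
        "education": {e["name"]: e for e in talent_data_set["college"]},
        "work":      {e["name"]: e for e in talent_data_set["enterprise"]},
        "award":     {e["name"]: e for e in talent_data_set["honor"]},
    }
    portrait, talent_score = {}, 0.0
    for category, names in image_set.items():
        entries = [lookup[category][n] for n in names if n in lookup[category]]
        score = max((e["score"] for e in entries), default=0)
        grade = min((e["grade"] for e in entries), default=None)  # grade 1 is the best grade
        portrait[category] = {"score": score, "grade": grade}
        talent_score += weights[category] * score
    portrait["talent_score"] = round(talent_score, 2)
    return portrait
```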
A talent portrait construction system based on big data modeling comprises a sample data processing module, a voice data processing module, a text data processing module and a portrait construction module;
the sample data processing module is used for acquiring sample talent data, performing talent weight refinement on the sample talent data to obtain a talent data set, and constructing a sample talent data model by using the talent data set;
the voice data processing module is used for acquiring voice data of interviewers and performing textualization processing on the voice data to obtain text data;
the text data processing module is used for screening the text data to obtain a screening data set, and matching the screening data set with the talent data set to obtain an image set;
the portrait construction module is used for constructing the talent portrait of the portrait set by utilizing the sample talent data model.
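As a purely illustrative sketch of how the four modules could be wired together, the dataclass below maps each module onto a pluggable callable; the names and signatures are assumptions made for illustration, not an implementation prescribed by the patent.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

@dataclass
class TalentPortraitSystem:
    """Four-module layout described above; each field is a pluggable processing step."""
    refine_weights:   Callable[[Any], Dict]        # sample data processing module (S1)
    speech_to_text:   Callable[[Any], str]         # voice data processing module (S2)
    screen_and_match: Callable[[str, Dict], Dict]  # text data processing module (S3)
    build_portrait:   Callable[[Dict, Dict], Dict] # portrait construction module (S4)

    def run(self, sample_talent_data, voice_data) -> Dict:
        talent_data_set = self.refine_weights(sample_talent_data)   # S1
        text = self.speech_to_text(voice_data)                      # S2
        image_set = self.screen_and_match(text, talent_data_set)    # S3
        return self.build_portrait(image_set, talent_data_set)      # S4
```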
The working steps of the embodiment of the invention comprise:
acquiring sample talent data, performing talent weight refinement on the sample talent data to obtain a talent data set, and constructing a sample talent data model by using the talent data set; wherein, will sample talent data carries out talent weight and refines, obtains the talent data set and includes: acquiring talent weight phrases in the sample talent data, and performing phrase division on the sample talent data to obtain a divided data set, wherein the divided data set comprises a sample college information set, a sample hundred-strength enterprise set and a sample honor set; performing grade division on the sample college information set to obtain a college grade set, performing score marking on the college grade set by using a preset score algorithm to obtain a college score set, combining the college grade set and the college score set to obtain sample college data, and marking colleges in the college grade set as Gi, i is 1,2,3 … n; carrying out grade division on the sample hundred-strength enterprise set to obtain an enterprise grade set, carrying out grade marking on the sample hundred-strength enterprise set by using a preset grade algorithm to obtain an enterprise grade set, combining the enterprise grade set and the enterprise grade set to obtain sample enterprise data, and marking enterprises in the enterprise grade set as Qi, wherein i is 1,2, and 3 … n; grade division is carried out on the sample honor sets to obtain the honor grade sets, score marking is carried out on the sample honor sets by utilizing a preset score algorithm to obtain the honor score sets, the honor grade sets and the honor score sets are combined to obtain sample honor data, and the honor items in the honor grade sets are marked as Ri, i is 1,2 and 3 … n; and combining the sample college data, the sample enterprise data and the sample honor data to obtain a talent data set.
Voice data of the interviewer is acquired, and textualization processing is performed on the voice data to obtain text data. Performing textualization processing on the voice data to obtain text data includes: sampling the voice data with a preset sampling rate and sampling bit depth to obtain a sampling data set; quantizing the sampling data set to obtain a quantized data set; pre-emphasizing the quantized data set to obtain a first feature set; framing and windowing the first feature set to obtain a second feature set; performing discrete Fourier transform processing on the second feature set to obtain a third feature set; and performing textualization processing on the third feature set with a text feature coefficient algorithm to obtain the text data.
In the embodiment of the invention, the preset sampling rate and sampling bit depth are 16 kHz and 16 bits, respectively. Sampling converts the continuous sound waveform in the voice data into discrete data points, the data points are stored as amplitude values, and the amplitude values are quantized into integers; quantization here is used in the digital-signal-processing sense;
the pre-emphasis is used to boost the energy of the high-frequency part of the sound; strengthening the high-frequency energy makes the high-frequency formants better usable and thereby improves recognition accuracy. Pre-emphasis can be realized with a first-order high-pass filter: in the time domain, if the input signal is x[n], where n indexes the samples, the filter is expressed as y[n] = x[n] - μ·x[n-1], where the preset coefficient μ takes a value between 0.9 and 1.0, usually 0.97;
framing means grouping N sampling points into one observation unit. In general N is 256 or 512, covering roughly 20-30 ms. To avoid excessive change between two adjacent frames, an overlap region of M sampling points is placed between them, where M is usually 1/2 or 1/3 of N. The sampling frequency adopted for speech recognition is 8 kHz or 16 kHz; at 8 kHz, a 256-sample frame corresponds to a duration of 256/8000 × 1000 = 32 ms;
windowing addresses the fact that everyday sound is generally a non-stationary signal whose statistical characteristics are not constant, although the signal can be considered stationary over a short period of time. A window is described by three parameters: window length, offset and shape. Each windowed sound segment is called a frame, the number of milliseconds in each frame is called the frame length, and the distance between the left boundaries of two adjacent frames is called the frame shift;
the discrete Fourier transform transforms the signal from the time domain to the frequency domain so that its spectral structure and variation can be studied. A fast Fourier transform is applied to each framed and windowed frame of the voice data to obtain the spectrum of each frame, and the squared magnitude of the spectrum gives the power spectrum of the voice signal;
the text characteristic coefficient algorithm is a Mel-frequency cepstral coefficient (MFCC) algorithm.
The text data is screened to obtain a screening data set, and the screening data set is matched with the talent data set to obtain an image set. Matching the screening data set with the talent data set to obtain the image set includes: performing phrase division on the text data to obtain a phrase data set, the phrases in the phrase data set being marked as Wi, i = 1, 2, 3, …, n; screening the phrase data set according to preset keywords to obtain the screening data set, wherein the screening data set comprises education data, work data and prize winning data, the screened phrases in the screening data set being marked as Wij, i = 1, 2, 3, …, n, j = 1, 2, 3, …, n; comparing the screening data set with the talent data set and storing the phrases that are the same in the comparison to obtain matching data; and combining the matching data to obtain the image set.
The talent portrait of the portrait set is constructed by using the sample talent data model. Constructing the talent portrait of the interviewer by using the portrait set includes: matching the image set with the sample talent data model to obtain the education data score, work data score and prize winning data score corresponding to the education data, work data and prize winning data in the image set; matching the education data score, the work data score and the prize winning data score with preset score grades to obtain education grade data, work grade data and prize winning grade data; and dividing and combining the education grade data, the work grade data and the prize winning grade data according to a preset proportion to obtain the talent portrait of the interviewee.
A talent portrait construction system based on big data modeling comprises a sample data processing module, a voice data processing module, a text data processing module and a portrait construction module; the sample data processing module is used for acquiring sample talent data, performing talent weight refinement on the sample talent data to obtain a talent data set, and constructing a sample talent data model by using the talent data set; the voice data processing module is used for acquiring voice data of interviewers and performing textualization processing on the voice data to obtain text data; the text data processing module is used for screening the text data to obtain a screening data set, and matching the screening data set with the talent data set to obtain an image set; the portrait construction module is used for constructing the talent portrait of the image set by using the sample talent data model;
in one aspect of the invention, sample talent data is acquired, talent weight refinement is performed on the sample talent data to obtain a talent data set, and a sample talent data model is constructed from the talent data set; the sample talent data model obtained by processing the sample data in advance provides strong data support for the subsequent construction of talent portraits and can effectively improve the accuracy and efficiency of talent portraits;
in another aspect, the voice data of the interviewer is acquired and converted into text data, the text data is screened to obtain a screening data set, the screening data set is matched with the talent data set to obtain an image set, and the talent portrait of the image set is constructed by using the sample talent data model; this overcomes the drawback of existing schemes, in which personnel information is obtained from resumes whose authenticity cannot be verified, so the authenticity of the acquired information is limited, and it can effectively improve the accuracy of talent portraits.
The foregoing is merely exemplary and illustrative of the present invention and various modifications, additions and substitutions may be made by those skilled in the art to the specific embodiments described without departing from the scope of the invention as defined in the following claims.

Claims (6)

1. A talent portrait construction method based on big data modeling is characterized by comprising the following steps:
s1: acquiring sample talent data, performing talent weight refinement on the sample talent data to obtain a talent data set, and constructing a sample talent data model by using the talent data set;
s2: acquiring voice data of interviewers, and performing textualization processing on the voice data to obtain text data;
s3: screening the text data to obtain a screening data set, and matching the screening data set with the talent data set to obtain an image set;
s4: and constructing the talent portrait of the portrait set by utilizing the sample talent data model.
2. The method for constructing a talent portrait based on big data modeling as claimed in claim 1, wherein said performing talent weight refinement on the sample talent data to obtain a talent data set comprises:
s21: acquiring talent weight phrases in the sample talent data, and performing phrase division on the sample talent data to obtain a divided data set, wherein the divided data set comprises a sample college information set, a sample top-100 enterprise set and a sample honor set;
s22: performing grade division on the sample college information set to obtain a college grade set, performing score marking on the college grade set by using a preset score algorithm to obtain a college score set, and combining the college grade set and the college score set to obtain sample college data, the colleges in the college grade set being marked as Gi, i = 1, 2, 3, …, n;
s23: performing grade division on the sample top-100 enterprise set to obtain an enterprise grade set, performing score marking on the sample top-100 enterprise set by using the preset score algorithm to obtain an enterprise score set, and combining the enterprise grade set and the enterprise score set to obtain sample enterprise data, the enterprises in the enterprise grade set being marked as Qi, i = 1, 2, 3, …, n;
s24: performing grade division on the sample honor set to obtain an honor grade set, performing score marking on the sample honor set by using the preset score algorithm to obtain an honor score set, and combining the honor grade set and the honor score set to obtain sample honor data, the honor items in the honor grade set being marked as Ri, i = 1, 2, 3, …, n;
s25: combining the sample college data, the sample enterprise data and the sample honor data to obtain the talent data set.
3. The method for constructing a talent portrait based on big data modeling as claimed in claim 1, wherein said performing textualization processing on the voice data to obtain text data comprises:
s31, sampling the voice data by using a preset sampling rate and sampling bit depth to obtain a sampling data set;
s32, quantizing the sampling data set to obtain a quantized data set;
s33, pre-emphasis is carried out on the quantized data set to obtain a first feature set;
s34, performing framing and windowing on the first feature set to obtain a second feature set;
s35, performing discrete Fourier transform processing on the second feature set to obtain a third feature set;
and S36, performing textualization processing on the third feature set by using a text feature coefficient algorithm to obtain text data.
4. The method for constructing a talent portrait based on big data modeling as claimed in claim 1, wherein said matching the screening data set with the talent data set to obtain an image set comprises:
s41: carrying out phrase division on the text data to obtain a phrase data set, and marking phrases in the phrase data set as Wi, i = 1, 2, 3, …, n;
s42: screening the phrase data set according to preset keywords to obtain a screened data set, wherein the screened data set comprises education data, working data and prize winning data, and marking screened phrases in the screened data set as Wij, i = 1, 2, 3, …, n, j = 1, 2, 3, …, n;
s43: comparing the screening data set with the talent data set, and storing phrases with the same comparison result to obtain matching data;
s44: and combining the matched data to obtain an image set.
5. The method for constructing a talent portrait based on big data modeling as claimed in claim 1, wherein said constructing the talent portrait of the portrait set by using the sample talent data model comprises:
s51, matching the image set with the sample talent data model to obtain education data scores, work data scores and prize winning data scores corresponding to the education data, the work data and the prize winning data in the image set;
s52, matching the education data score, the work data score and the prize winning data score with a preset score grade to obtain the education grade data, the work grade data and the prize winning grade data;
and S53, dividing and combining the education grade data, the work grade data and the prize winning grade data according to a preset proportion to obtain the talent portrait of the interviewee.
6. A talent portrait construction system based on big data modeling is characterized by comprising a sample data processing module, a voice data processing module, a text data processing module and a portrait construction module;
the sample data processing module is used for acquiring sample talent data, performing talent weight refinement on the sample talent data to obtain a talent data set, and constructing a sample talent data model by using the talent data set;
the voice data processing module is used for acquiring voice data of interviewers and performing textualization processing on the voice data to obtain text data;
the text data processing module is used for screening the text data to obtain a screening data set, and matching the screening data set with the talent data set to obtain an image set;
the portrait construction module is used for constructing the talent portrait of the portrait set by utilizing the sample talent data model.
CN202010493764.4A 2020-06-03 2020-06-03 Talent portrait construction method and system based on big data modeling Pending CN111666377A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010493764.4A CN111666377A (en) 2020-06-03 2020-06-03 Talent portrait construction method and system based on big data modeling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010493764.4A CN111666377A (en) 2020-06-03 2020-06-03 Talent portrait construction method and system based on big data modeling

Publications (1)

Publication Number Publication Date
CN111666377A true CN111666377A (en) 2020-09-15

Family

ID=72385648

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010493764.4A Pending CN111666377A (en) 2020-06-03 2020-06-03 Talent portrait construction method and system based on big data modeling

Country Status (1)

Country Link
CN (1) CN111666377A (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2973297A2 (en) * 2013-03-15 2016-01-20 Fem, Inc. Media content discovery and character organization techniques
US20160034585A1 (en) * 2014-08-01 2016-02-04 Yahoo!, Inc. Automatically generated comparison polls
CN106920557A (en) * 2015-12-24 2017-07-04 中国电信股份有限公司 A kind of distribution method for recognizing sound-groove and device based on wavelet transformation
CN106897402A (en) * 2017-02-13 2017-06-27 山大地纬软件股份有限公司 The method and user's portrait maker of user's portrait are built based on social security data
CN107229729A (en) * 2017-06-07 2017-10-03 北京幸福圈科技有限公司 A kind of efficiency economy business modular system based on artificial intelligence assistant
CN108062657A (en) * 2017-11-30 2018-05-22 朱学松 Method and system are interviewed in personnel recruitment
CN109726253A (en) * 2018-12-21 2019-05-07 义橙网络科技(上海)有限公司 Construction method, device, equipment and the medium of talent's map and talent's portrait
CN110414917A (en) * 2019-06-21 2019-11-05 东华大学 Recruitment recommended method based on talent's portrait
CN110473546A (en) * 2019-07-08 2019-11-19 华为技术有限公司 A kind of media file recommendation method and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200915