WO2023249345A1

WO2023249345A1 - Method and system for analyzing work experience data

Info

Publication number: WO2023249345A1
Application number: PCT/KR2023/008436
Authority: WO
Inventors: 전혜진
Original assignee: 주식회사 이지태스크
Priority date: 2022-06-20
Filing date: 2023-06-19
Publication date: 2023-12-28
Also published as: KR20230174179A

Abstract

One embodiment of the present invention provides a method for analyzing work experience data. The method comprises the steps of: collecting work experience data and performing pre-processing of the collected work experience data, the pre-processing including tokenization, stop word removal, and noun extraction; categorizing the pre-processed work experience data; and performing either positive or negative sentiment analysis on the work experience data.

Description

Career work data analysis methods and systems

The present invention relates to a career work data analysis method and system, and more specifically, to a method and system that collect career work data, classify the collected career work data into categories according to preset standards, and perform sentiment analysis on the classified career work data to determine whether the evaluation of a freelancer is positive or negative.

Recently, in accordance with changing social phenomena such as the pandemic phenomenon and the introduction of flexible working hours, increasingly flexible forms of work life are in the spotlight.

In addition, job creation remains a major social issue. Companies continue to hire full-time employees for internal and external reasons, but the burden is difficult to relieve because hiring brings enormous fixed and variable expenses with it.

Due to these circumstances, companies are reducing the number of regular employees and increasing the employment of freelancers, outsourcers, and contract workers. In particular, the professional freelance market is growing significantly.

However, the freelance job search process still requires cumbersome procedures, which consume a lot of time and money. For example, even for a relatively simple job, job postings, resume reviews, evaluations, and interviews are carried out.

To overcome this situation, online platforms that connect companies and freelancers are appearing in the market.

However, conventional online platforms have structural problems that intensify price competition among freelancers.

Specifically, existing platforms have a service structure in which freelancers directly present prices, resulting in excessive price competition relative to the time devoted to work. This continuous competition ultimately leads to low unit prices, and can form a vicious cycle in which opportunities to obtain work are limited as exposure frequency decreases.

In addition, the nature of the freelance market causes information asymmetry, and conventional online platform technology does not solve this problem.

For example, the freelance market covers a variety of fields, and tasks are subdivided within each field, so even nominally identical work can in practice be interpreted as different work. As a result, freelance work experience is not described in a unified way, and a different situation is likely to arise for each work request. In other words, the requested cost for the same work varies greatly, and the market has a structure in which it is difficult to guarantee trust in the experience and abilities of freelancers.

Accordingly, there is a need for research on how to efficiently manage freelancer's career data and analyze the freelancer's career to determine whether the freelancer's evaluation is positive or negative.

The present disclosure is intended to solve the problems of the prior art described above, and aims to provide a method and system that determine whether a freelancer's evaluation is positive or negative by collecting career work data, classifying the collected career work data into categories according to preset standards, and performing sentiment analysis on the classified career work data.

The technical problems to be achieved by the present invention are not limited to the above-described technical problems, and other technical problems of the present invention can be derived from the following description.

As a technical means for solving the above-described technical problem, an embodiment according to the first aspect of the present disclosure provides a career work data analysis method. The method includes collecting career work data, performing a preprocessing process including tokenization, stopword removal, and noun extraction on the collected career work data, classifying the preprocessed career work data into categories, and performing sentiment analysis on the career work data as either positive or negative.

Additionally, an embodiment according to the second aspect of the present disclosure provides a career work data analysis system. The system includes a communication module, at least one processor, and a memory electrically connected to the processor and storing at least one code to be executed by the processor. The memory stores code that, when executed by the processor, causes the processor to collect career work data, perform a preprocessing process including tokenization, stopword removal, and noun extraction on the collected career work data, classify the preprocessed career work data into categories, and perform sentiment analysis on the career work data as either positive or negative.

According to the present invention, by collecting data related to career or career history from other external servers, it is possible to build a large amount of data on the freelancer's career work.

Additionally, according to the present invention, later processing and analysis of data can be facilitated by subdividing, classifying, and storing the collected career tasks on a systematic basis.

Additionally, according to the present invention, by collecting review data about freelancers from other external servers, it can be used as data to judge the freelancer's career.

Additionally, according to the present invention, highly reliable career verification can be achieved by determining whether the freelancer's evaluation is positive or negative based on the attributes of the collected review data.

In addition, according to the present invention, a method of collecting and efficiently managing a vast amount of career work data is disclosed, thereby laying the foundation for solving the problem of asymmetry of information related to the freelance market.

The effects of the present invention are not limited to the effects described above, and include all effects understood from the following description.

FIG. 1 is a diagram illustrating a career work data analysis system according to an embodiment of the present invention.

FIG. 2 is a diagram showing the detailed configuration of the server shown in FIG. 1.

Figure 3 is a diagram illustrating the process of analyzing career work data.

Figure 4 is a diagram illustrating an example of classifying categories of career work data using the LDA technique.

Figure 5 is a diagram illustrating a career work data analysis system according to another embodiment of the present invention.

Figure 6 is a flowchart showing the sequence of a career work data analysis method according to another embodiment of the present invention.

Hereinafter, the present disclosure will be described in detail with reference to the attached drawings. However, the present disclosure may be implemented in various different forms and is not limited to the embodiments described herein. In addition, the attached drawings are only intended to facilitate understanding of the embodiments disclosed in this specification, and the technical idea disclosed in this specification is not limited by the attached drawings. All terms, including technical and scientific terms, used herein should be interpreted as having the meanings commonly understood by those skilled in the art in the technical field to which this disclosure pertains. Terms defined in dictionaries should be interpreted as having meanings consistent with the related technical literature and the presently disclosed content, and should not be interpreted in an overly idealized or restrictive sense unless otherwise defined.

In order to clearly explain the present disclosure in the drawings, parts not related to the description are omitted, and the size, form, and shape of each component shown in the drawings may be modified in various ways. Throughout the specification, identical/similar parts are given identical/similar reference numerals.

The suffixes “module” and “part” for components used in the following description are given or used interchangeably only for the ease of preparing the specification, and do not have distinct meanings or roles in themselves. Additionally, in describing the embodiments disclosed in this specification, if it is determined that detailed descriptions of related known technologies may obscure the gist of the embodiments disclosed in this specification, the detailed descriptions are omitted.

Throughout the specification, when a part is said to be “connected (coupled, contacted, or combined)” with another part, this includes not only the case where it is “directly connected (coupled, contacted, or combined)” but also the case where it is “indirectly connected (coupled, contacted, or combined)” with other members in between. Additionally, when a part is said to "include (be equipped with or provide)" a certain component, this means that, unless specifically stated to the contrary, other components are not excluded and may be further included.

Terms representing ordinal numbers, such as first, second, etc., used in this specification are used only for the purpose of distinguishing one component from another component and do not limit the order or relationship of the components. For example, the first component of the present disclosure may be named a second component, and similarly, the second component may also be named a first component. As used herein, singular forms of expression should be construed to also include plural forms of expression, unless the contrary is clearly indicated.

Referring to FIG. 1, the career work data analysis system is a device that collects and analyzes career work data, and may be implemented in the form of a server or terminal, for example. According to an embodiment of the present invention, the career work data analysis system can be included in a work integrated control system and built as part of optimized online work brokerage and career management.

The server 100 collects career work data and performs a pre-processing process including tokenization, stop word removal, and noun extraction on the collected career work data. For example, the server 100 may collect career work data from at least one of a communication-connected external server, an external database, and a user terminal. Here, career work data may include review data.

The server 100 classifies the categories of the preprocessed career work data. Here, the category may be composed of words from career work data that have the same subject range.

The server 100 performs sentiment analysis on career work data as either positive or negative.

The user terminal 200 may transmit career work data to the server 100 and receive a sentiment analysis result from the server 100.

The user terminal 200 may be connected to the server 100 through a communication network. The user terminal 200 may be a desktop or laptop equipped with a web browser, or a wireless communication device that guarantees portability and mobility, including any type of handheld wireless communication device such as a smartphone or tablet PC.

Referring to FIG. 2 , the server 100 may include a communication module 110, a processor 120, and a memory 130.

The communication module 110 may include a device including hardware and software necessary to transmit and receive signals such as control signals or data signals through wired or wireless connections with other network devices.

The communication module 110 may receive career work data from at least one of a user terminal, a communication-connected external server, and an external database. Additionally, the communication module 110 may transmit the results of sentiment analysis of career work data to at least one of a user terminal, a communication-connected external server, and an external database. Here, the external database is a device that stores various data generated by a specific job search site on the web or in an application, and can be linked to the career work data analysis system over a network to provide the career work data it stores.

According to one embodiment, the external database may be included in an external server that controls various procedures of the job search site, and is preferably implemented as a cloud server to continuously provide data regardless of space.

The processor 120 may include various types of devices that control and process data. The processor 120 may refer to a data processing device built into hardware that has a physically structured circuit to perform functions expressed by codes or instructions included in a program.

In one example, the processor 120 may be implemented in the form of a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), etc., but the scope of the present invention is not limited thereto.

The processor 120 performs operations according to the code stored in the memory 130.

The memory 130 may store at least one of information and data input to the communication module 110, information and data required for functions performed by the processor 120, and data generated by execution of the processor 120.

The memory 130 should be interpreted as a general term covering both non-volatile storage devices, which retain stored information even when power is not supplied, and volatile storage devices, which require power to maintain stored information. The memory 130 may include magnetic storage media or flash storage media in addition to volatile storage devices, but the scope of the present invention is not limited thereto.

The memory 130 is electrically connected to the processor 120 and stores at least one code executed by the processor 120. The memory 130 stores code that, when executed through the processor 120, causes the processor 120 to perform the following functions and procedures.

Memory 130 stores code that causes career work data to be collected. For example, career work data may be collected from at least one of a communication-connected external server, an external database, and a user terminal.

Here, career work data refers to the totality of data that can prove the career or work history of freelancers. It is sufficient that the data is distributed online, and whether it is structured or unstructured, and what type it is, does not limit the present invention. In the case of document data, examples include, but are not limited to, ID cards, bankbook copies, personal information consent forms, resumes, work confirmations, work settlement statements, expense payment confirmations, and work logs.

In the present invention, career work data may include experience (years/hours/period, etc.), workplace, position, task, role, program used, work field, detailed work history, participation rate, collaborator, task name, task performance goal, and purpose of use, and may also include review data that evaluates the experienced person concerned, such as chats or reviews.

In the present invention, "freelancer" is not limited to a person who works on a free contract without any affiliation, but should be interpreted to include all individuals and organizations who are hired to perform a specific task, provide labor, and receive compensation.

Memory 130 stores code that causes preprocessing, including tokenization, stopword removal, and noun extraction, to be performed on the collected career work data.

For example, based on the code stored in the memory 130, the sentence "The designs all came out so pretty..." can be tokenized into "design", "this", "all", "so", "pretty", "came out", and "...". Stop words can then be removed from these tokens so that only "design", "so", and "pretty" remain. Finally, only the noun "design" can be extracted from "design", "so", and "pretty".

The memory 130 may store code that collects work requirement data and micro task data from the job requests that client companies post on job search sites, and classifies the collected career work data by occupation, task, and micro task. For example, an occupation is a type of job; a task is something that is objectively performed repeatedly a considerable number of times, or performed with the intention of continued repetition; and a micro task is a smaller unit into which a task is subdivided, so that part of the work is performed on a piecework basis.

Here, the job description area shown on the job search site explains the job in detail. For example, if the field is progressively subdivided as office work - design work - Photoshop, the data for "office work" can be classified into the occupation group, the data for "design" into the task group, and the data for "Photoshop" into the micro task group. Hereinafter, the smallest unit of work, such as Photoshop, is defined as a "micro task."
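The three-level subdivision described above can be represented with a small record type. The following is only an illustrative sketch: the class name `JobClassification`, the function `parse_job_path`, and the assumption that the levels arrive as a single string separated by " - " are hypothetical and are not specified in this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobClassification:
    """Hypothetical record holding the three levels named in the text."""
    occupation: str   # e.g. "office work"
    task: str         # e.g. "design work"
    micro_task: str   # e.g. "Photoshop"


def parse_job_path(path: str, sep: str = " - ") -> JobClassification:
    """Split an 'occupation - task - micro task' string into its levels.

    The separator and the fixed depth of three are assumptions made for
    this sketch only.
    """
    parts = [part.strip() for part in path.split(sep)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 levels, got {len(parts)}: {path!r}")
    return JobClassification(*parts)


example = parse_job_path("office work - design work - Photoshop")
```

Here `example.micro_task` holds the smallest unit of work, which is the level on which later categorization operates.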

The memory 130 may store a code that causes the preprocessed career work data to be classified into categories. For example, the memory 130 may store a code that causes category classification of career work data using an LDA (Latent Dirichlet Allocation) technique.

Specifically, the career work data analysis system may determine categories (topics) for each pre-stored task, determine into which category the nouns extracted from each item of career work data through preprocessing should be clustered, and classify each noun into the determined category.

Accordingly, each category may be composed only of words from career work data having the same subject range. For example, if the category is “translation,” words such as grammar, expression, and word usage may be included, and if the category is “PPT,” words such as design, layout, and unity may be included.

Categories produced by the LDA technique may contain duplicate words, or words that are unrelated or meaningless. Accordingly, the memory 130 may store code that causes the career work data classified into categories to be reclassified. For example, if the word "proportion" is identified but no design-related words accompany it, it may be assigned to a new category through a fresh judgment rather than being placed in the design category.

The memory 130 may store code that learns the words of the career work data with Word2Vec, a deep learning model, using the skip-gram method in 200 dimensions, and builds a vector space model from the vector value of each word. The memory 130 may also store code that calculates the similarity between words, or between a word and a category, based on Equation 1 and causes the data to be reclassified accordingly. For example, words can be divided into those whose similarity is at or above a certain threshold and those whose similarity is below it.

Here, A is a word that is a reference value according to the matrix, B is an extracted word, and the similarity between words A and B can be calculated through Equation 1.
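Equation 1 itself is not reproduced in this text. A common similarity measure for Word2Vec vectors is cosine similarity, cos(A, B) = (A · B) / (|A| |B|), and the sketch below assumes that measure purely for illustration. The three-dimensional vectors, the threshold of 0.5, and the function names are hypothetical; the disclosure describes 200-dimensional vectors and does not state a threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def split_by_similarity(reference, candidates, threshold):
    """Divide words into those at or above the threshold and those below it."""
    kept, to_reclassify = [], []
    for word, vector in candidates.items():
        if cosine_similarity(reference, vector) >= threshold:
            kept.append(word)
        else:
            to_reclassify.append(word)
    return kept, to_reclassify


# Toy vectors standing in for learned Word2Vec embeddings (assumed values).
design_reference = [1.0, 0.0, 0.2]                      # word A (reference)
extracted_words = {                                     # words B (extracted)
    "layout": [0.9, 0.1, 0.3],
    "grammar": [0.0, 1.0, 0.1],
}
kept, to_reclassify = split_by_similarity(design_reference, extracted_words, 0.5)
```

With these toy values, "layout" stays with the reference word and "grammar" is set aside for reclassification.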

Review data is generally sentence- or paragraph-level data, so it was previously difficult to determine whether it was positive or negative. For example, even if the review data contains mostly positive feedback, if it ends with “but I will not work with this freelancer,” the review data should be determined as negative, but it would mostly be classified as positive.

In order to solve the above problem, the memory 130 of the present invention may store code that determines, for each classified item of career work data, whether the attribute is positive or negative. For example, the memory 130 may store code that causes the weight given to a positive word to be adjusted according to the customer's satisfaction rating and whether a future rematch occurs. For example, even a positive response such as “Yes,” “It’s good,” or “Thank you for your hard work” may be judged negative if the individual rating is low or no rematch takes place, and may be judged positive if a subsequent match proceeds or the rating is high.
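One possible reading of this weight adjustment is sketched below. The disclosure does not give a formula, so everything here is an assumption made for illustration: the 0 to 5 rating scale, the linear mapping of the rating to a value between -1 and +1, the fixed rematch bonus of 0.5, and the function names.

```python
def adjusted_polarity(positive_score, rating, rematched, max_rating=5.0):
    """Scale a positive-word score by a weight derived from rating and rematch.

    Assumed scheme: the rating is mapped linearly to [-1, +1], a rematch
    adds 0.5 and no rematch subtracts 0.5, and the sum is clipped to [-1, +1].
    """
    rating_weight = 2.0 * rating / max_rating - 1.0
    rematch_weight = 0.5 if rematched else -0.5
    weight = max(-1.0, min(1.0, rating_weight + rematch_weight))
    return positive_score * weight


def label(score):
    """Binary label; a score of exactly zero is treated as negative here."""
    return "positive" if score > 0 else "negative"


# "Thank you for your hard work" with a low rating and no rematch.
polite_but_unhappy = label(adjusted_polarity(1.0, rating=1, rematched=False))
# The same wording with a high rating and a rematch.
polite_and_happy = label(adjusted_polarity(1.0, rating=5, rematched=True))
```

The same polite wording therefore ends up with different labels depending on what the customer did afterwards, which is the behavior the paragraph above describes.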

The memory 130 may store code that converts review data into attribute units and analyzes whether each attribute is positive or negative. For example, the memory 130 may store code that assigns an attribute value to each word and specifies similar words for that word, forming a data set from which the underlying meaning can be understood.

For example, as review data is collected, attributes continue to accumulate, so only meaningful attributes can be extracted through preset statistical analysis. Preferably, noun-level attribute identification can be performed on the entire review data.

Attributes classified from specific career work data may include core attributes, peripheral attributes, and communication attributes, and examples of each attribute may be as shown in Table 1.

Category	Details
Core attributes	Output, cost, time
Peripheral attributes	Micro tasks
Communication attributes	Communication

The memory 130 stores code that causes sentiment analysis to be performed on career work data as either positive or negative. For example, the memory 130 may store code that causes sentiment analysis to be performed, as either positive or negative, on the review data within the career work data that contains evaluations of experienced workers. Code may also be stored that performs a subjectivity detection technique to extract only those parts of the career work data in which a person's subjectivity appears. For example, the subjectivity detection technique may be a detection technique that selects only the elements to be used for sentiment analysis by removing the parts of the career work data that are unrelated to emotion, as well as personal information.

The memory 130 may store code that performs a polarity detection technique on the part of the career work data in which a person's subjectivity appears, and causes that part to be classified as either positive or negative. For example, a polarity detection technique may be a detection technique that detects and quantifies positive and negative words in text and determines whether a sentence containing the text is positive or negative by applying weights representing positive and negative words.
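The two detection stages can be illustrated with a small lexicon-based sketch. This is not the implementation of the disclosure: the English lexicon, its weights, the rule that a sentence is subjective when it contains a lexicon word, and the rule that a total of zero counts as negative are all assumptions chosen to keep the example short.

```python
import re

# Hypothetical sentiment lexicon: word -> signed weight.
LEXICON = {"pretty": 1.0, "great": 1.0, "late": -1.0, "disappointing": -1.5}


def words(sentence):
    """Lower-case word tokens of a sentence."""
    return re.findall(r"[a-z']+", sentence.lower())


def subjective_sentences(text):
    """Subjectivity detection: keep sentences containing a lexicon word."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [s for s in sentences if any(w in LEXICON for w in words(s))]


def polarity(sentence):
    """Polarity detection: sum of the weights of the lexicon words present."""
    return sum(LEXICON.get(w, 0.0) for w in words(sentence))


def classify(text):
    """Binary label over the subjective part of the text."""
    score = sum(polarity(s) for s in subjective_sentences(text))
    return "positive" if score > 0 else "negative"


review = (
    "The design was pretty. The file was sent on Monday. "
    "Delivery was late and disappointing."
)
kept = subjective_sentences(review)
result = classify(review)
```

The middle sentence carries no opinion and is dropped before scoring, so only the first and third sentences contribute to the final label.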

The memory 130 may store a code that extracts verbs and adjectives from career work data and analyzes emotional words through Equation 2.

Here, x is a sentence containing an attribute, c may be a class that determines positivity or negativity, and if sentence ‘x’ is positive, it can have a value of +1, and if it is negative, it can have a value of -1.

TC-W2V means measuring the relatedness between words in a topic using Word2Vec, N means the top k words in the topic, and K means the total number of topics.

The similarity term denotes the similarity value between two words, calculated from their Word2Vec vectors.
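The formula for TC-W2V is not reproduced in this text. The sketch below follows the commonly used definition of this coherence measure, in which the pairwise Word2Vec similarities of the top N words of a topic are averaged and the result is then averaged over the K topics. That this matches the formula of the disclosure is an assumption, and the two-dimensional vectors and function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def topic_coherence(top_words, vectors):
    """Mean pairwise similarity among the top N words of one topic."""
    pairs = list(combinations(top_words, 2))
    return sum(cosine(vectors[u], vectors[v]) for u, v in pairs) / len(pairs)


def tc_w2v(topics, vectors):
    """Average of the per-topic coherence over all K topics."""
    return sum(topic_coherence(t, vectors) for t in topics) / len(topics)


# Toy embeddings (assumed values): design words and language words
# point in orthogonal directions.
vectors = {
    "design": [1.0, 0.0], "layout": [1.0, 0.0],
    "grammar": [0.0, 1.0], "spelling": [0.0, 1.0],
}
coherent = tc_w2v([["design", "layout"], ["grammar", "spelling"]], vectors)
mixed = tc_w2v([["design", "grammar"], ["layout", "spelling"]], vectors)
```

A higher value indicates that the words grouped into each topic are closer to one another in the embedding space, so `coherent` exceeds `mixed` for these toy vectors.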


Referring to FIG. 3, the career work data analysis system may include a data collection unit 310, a data pre-processing unit 320, and an LDA processing unit 330.

The data collection unit 310 may collect career work data from a user terminal, an external server connected to communication with the career work data analysis system, and an external database.

The data collection unit 310 may collect career work data listed on a job search site. For example, the data collection unit 310 may perform Internet crawling to retrieve career work data distributed externally.

Specifically, the system repeats a process of searching for URLs related to job search services on the network and finding, classifying, and storing the other hyperlinks within each retrieved URL. In this way, it can browse the web pages of job search sites on the Internet, find career data in each external database, create an index identifying where each item of data is located, and save it in the database within the career work data analysis system.
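The repeated search-and-follow process can be sketched as a breadth-first crawl. To keep the example runnable without network access, pages are served from an in-memory dictionary; the `jobs.example` URLs, the `FAKE_WEB` contents, the `fetch` callback, and the page limit are all hypothetical, and a real crawler would also need to respect each site's terms and robots rules.

```python
from collections import deque
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)


def crawl(seed_url, fetch, max_pages=100):
    """Breadth-first crawl; returns an index mapping URL -> page content."""
    index, queue, seen = {}, deque([seed_url]), {seed_url}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:          # page could not be retrieved; skip it
            continue
        index[url] = html
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # visit each hyperlink only once
                seen.add(link)
                queue.append(link)
    return index


# In-memory stand-in for a job search site (assumed content).
FAKE_WEB = {
    "https://jobs.example/": '<a href="https://jobs.example/p/1">1</a>'
                             '<a href="https://jobs.example/p/2">2</a>',
    "https://jobs.example/p/1": '<p>Photoshop retouching</p>'
                                '<a href="https://jobs.example/">home</a>',
    "https://jobs.example/p/2": '<p>PPT design</p>'
                                '<a href="https://jobs.example/missing">x</a>',
}
pages = crawl("https://jobs.example/", FAKE_WEB.get)
```

The returned dictionary plays the role of the index that records where each item of data was found.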

The data preprocessing unit 320 may perform a preprocessing process on the collected career work data.

For example, the data preprocessing unit 320 may sequentially perform tokenization, stopword removal, and noun extraction on career work data, thereby standardizing the data so that the meaning or task it typically represents can be determined. Table 2 below shows an example of the preprocessing process for specific review data.

Data	Review
Original	디자인이 모두 너무 예쁘게 나왔어요... (The designs all came out so pretty...)
Tokenization	디자인, 이, 모두, 너무, 예쁘게, 나왔어요, ... (design, this, all, so, pretty, came out, ...)
Stop word removal	디자인, 너무, 예쁘게 (design, so, pretty)
Noun extraction	디자인 (design)
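The rows of Table 2 can be reproduced with a toy pipeline. This is only a sketch under stated assumptions: the particle list, the stop word set, and the noun lexicon are hand-made for this single Korean example sentence, whereas a real system would rely on a morphological analyzer and curated word lists.

```python
import re

# Hypothetical resources, sized to cover only the example sentence.
PARTICLES = ("이", "가", "을", "를", "은", "는")
STOP_WORDS = {"이", "모두", "나왔어요", "..."}
NOUNS = {"디자인"}


def tokenize(sentence):
    """Split on whitespace, keep '...' as a token, detach a trailing particle
    when the remaining stem is a known noun."""
    tokens = []
    for chunk in re.findall(r"\.\.\.|[^\s.]+", sentence):
        if chunk.endswith(PARTICLES) and chunk[:-1] in NOUNS:
            tokens.extend([chunk[:-1], chunk[-1]])
        else:
            tokens.append(chunk)
    return tokens


def remove_stop_words(tokens):
    """Drop tokens listed in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def extract_nouns(tokens):
    """Keep only tokens found in the noun lexicon."""
    return [t for t in tokens if t in NOUNS]


original = "디자인이 모두 너무 예쁘게 나왔어요..."
tokens = tokenize(original)
filtered = remove_stop_words(tokens)
nouns = extract_nouns(filtered)
```

Each variable corresponds to one row of Table 2, from the tokenized sentence down to the single extracted noun.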

The LDA processing unit 330 may classify the categories of career work data preprocessed through the data preprocessing unit 320. For example, the LDA processing unit 330 may perform category classification of career work data using the Latent Dirichlet Allocation (LDA) technique. For example, the LDA processing unit 330 may analyze career work data based on the LDA technique and display the capabilities of each of at least one category according to the analysis results in the form of a graph.

Referring to Figure 4, the career work data analysis system can categorize career work data using the LDA (Latent Dirichlet Allocation) technique. The LDA technique is a representative algorithm for topic modeling. By analyzing a large amount of document data with a probability-based model, it can estimate which topics each document is composed of and in what proportions.
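How LDA arrives at per-document topic proportions can be illustrated with a compact collapsed Gibbs sampler. This is a teaching sketch rather than the implementation of the disclosure: the hyperparameters `alpha` and `beta`, the iteration count, the fixed random seed, and the four toy documents are assumptions, and a production system would normally use an established topic modeling library.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over documents given as token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]   # tokens of doc d in topic k
    topic_word = [{} for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                 # tokens assigned to topic k
    assignments = []

    def update(d, w, k, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] = topic_word[k].get(w, 0) + delta
        topic_total[k] += delta

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        assignments.append([rng.randrange(num_topics) for _ in doc])
        for w, k in zip(doc, assignments[d]):
            update(d, w, k, +1)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assignments[d][i], -1)  # take the token out
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k].get(w, 0) + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                update(d, w, k, +1)                  # put it back under topic k

    # Smoothed topic proportions for each document.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word


# Toy corpus: nouns extracted from four hypothetical reviews.
docs = [
    ["design", "layout", "color"],
    ["grammar", "spelling", "expression"],
    ["design", "color", "unity"],
    ["grammar", "expression", "speed"],
]
theta, topic_word = lda_gibbs(docs, num_topics=2)
```

Each row of `theta` gives the estimated share of each topic in one document, and each entry of `topic_word` lists how often a word was assigned to that topic, which is the raw material for naming a category.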

Specifically, the career work data analysis system can determine in advance the topic value corresponding to each work-related noun, and can analyze words that newly occur while work is in progress to determine which of the various categories they fit into.

Each category can consist only of words from career work data that share the same subject scope. For example, if the category is “translation,” it may consist of words such as grammar, expression, word usage, spelling, speed, writing power, and delivery, and if the category is “PPT,” it may consist of words such as design, layout, and unity.

The career work data analysis system can set categories based on collected review data or set categories based on preset job names. For example, the career work data analysis system can initially set words such as color, layout, and ratio as categories of design work, and classify them into design categories when data called layout comes in based on the values once set.
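The preset-category case can be sketched as a simple lookup against seed words. The seed lists below reuse the example words from the description, but the data structure, the function name `assign_category`, and the choice to return `None` for unknown words are assumptions made for illustration.

```python
# Hypothetical seed words per category, taken from the examples in the text.
SEED_CATEGORIES = {
    "design": {"color", "layout", "ratio"},
    "translation": {"grammar", "expression", "spelling"},
}


def assign_category(word, categories=SEED_CATEGORIES):
    """Return the category whose seed words contain the word, else None."""
    for name, seed_words in categories.items():
        if word in seed_words:
            return name
    return None


incoming = ["layout", "spelling", "follower"]
assigned = {word: assign_category(word) for word in incoming}
```

Words that match no seed list, such as "follower" here, are left unassigned and could be passed to a later reclassification step.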

By analyzing a designer's career work data, the ability values of each category, such as Figma, Photoshop, color, layout, and illustration, can be displayed in graph form. Using the same principle, by analyzing a translator's career work data, the ability values of each category, such as expression, spelling, speed, writing ability, and delivery ability, can be displayed in graph form. By analyzing a marketer's career work data, the ability values of each category, such as Google Console usage, GA usage, Instagram follower recruitment ability, Facebook usage ability, and writing skills, can be displayed in graph form.

Referring to Figure 5, the career work data analysis system may include a matching management system and a career management system.

The matching management system can handle the overall process for online business matching services. In particular, it is possible to check the mutual needs of the client and performer in real time, perform matching based on satisfaction and feedback, and continuously secure career data for micro tasks.

The matching management system can manage at least one of customer management, task management, information data management, competency management, administrator management, and purchase/cost management.

The career management system comprises the career work data analysis system according to an embodiment of the present invention, and can continuously secure additional data related to career work by connecting to various external data sources, including external databases. Accordingly, standardized criteria for freelancers' experience, history, and pay rates can be presented.

The career management system can manage at least one of the tasks and career management of each worker and performer, information data management, career analysis management, document and certification management, micro tasks, and freelance career management.

In the matching management system, task matching is carried out between customers and task performers, and feedback can occur according to the tasks performed. Customer satisfaction and feedback generated here can be used as information data under the career management system to analyze careers (competencies) and provide them to customers.

The career work data analysis method to be described below may be performed by the career work data analysis system and server previously described with reference to FIGS. 1 to 5. Accordingly, the contents of the embodiments of the present disclosure previously described with reference to FIGS. 1 to 5 can be equally applied to the embodiments to be described below, and contents that overlap with the description described above will be omitted below. The steps described below do not necessarily have to be performed in order, the order of the steps may be set in various ways, and the steps may be performed almost simultaneously.

Referring to FIG. 6, the career work data analysis method includes a preprocessing step of career work data (S100), a categorization step of career work data (S200), and a sentiment analysis step of career work data (S300).

The preprocessing step (S100) of career work data is a step in which career work data is collected and a preprocessing process including tokenization, stopword removal, and noun extraction is performed on the collected career work data. For example, in the pre-processing step (S100) of career work data, career work data including review data including an evaluation of a career employee may be collected from at least one of a communication-connected external server, an external database, and a user terminal.

The category classification step (S200) of career work data is a step of classifying the categories of preprocessed career work data. For example, in the category classification step (S200) of career work data, the career work data may be analyzed based on the LDA technique, and the career work data categories may be classified according to the analysis results. Categories may be composed of words from career work data that have the same subject range.

The sentiment analysis step (S300) of career work data is a step of performing sentiment analysis on career work data as either positive or negative. For example, in the sentiment analysis step (S300) of career work data, a subjectivity detection technique may be performed on the career work data to extract only the portions in which a person's subjectivity appears, and a polarity detection technique may then be performed on those portions to classify them as either positive or negative.

Here, the subjectivity detection technique is a detection technique that classifies only the elements to be used for sentiment analysis by removing parts of the career work data that are not related to emotions and personal information. The polarity detection technique detects and quantifies positive and negative words in the text, and may be a detection technique that indicates the state of a sentence by applying statistical techniques that assign each word a weight according to whether it is positive or negative.
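The two-stage analysis described above can be sketched as follows. This is a simplified, lexicon-based illustration rather than the statistical technique of the disclosure: the polarity lexicon and its weights, the function names, the rule that a sentence is subjective when it contains a lexicon word, and the handling of a zero score are all assumptions. Removal of personal information is not shown.

```python
import re

# Hypothetical polarity lexicon: word -> weight (positive > 0, negative < 0).
POLARITY = {
    "good": 1.0, "excellent": 2.0, "kind": 1.0,
    "late": -1.0, "poor": -1.5, "rude": -2.0,
}


def split_sentences(text):
    """Split text into sentences on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def words_of(sentence):
    return re.findall(r"[a-z]+", sentence.lower())


def is_subjective(sentence):
    """Subjectivity detection (assumed rule): the sentence contains an opinion word."""
    return any(w in POLARITY for w in words_of(sentence))


def polarity_score(sentence):
    """Polarity detection: sum the weights of the opinion words in the sentence."""
    return sum(POLARITY.get(w, 0.0) for w in words_of(sentence))


def analyze(review):
    """Return (subjective sentences, total score, label)."""
    subjective = [s for s in split_sentences(review) if is_subjective(s)]
    if not subjective:
        return [], 0.0, None  # nothing to classify
    score = sum(polarity_score(s) for s in subjective)
    # A score of exactly zero is labeled positive here; this is an arbitrary choice.
    label = "positive" if score >= 0 else "negative"
    return subjective, score, label


review = "The project ran from March to May. The work was excellent. Replies were sometimes late."
print(analyze(review))
# (['The work was excellent', 'Replies were sometimes late'], 1.0, 'positive')
```

The first sentence of the sample review carries no opinion word and is discarded by the subjectivity stage, so only the remaining two sentences contribute to the polarity score.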

Those skilled in the art to which this disclosure pertains will understand, based on the above description, that the present disclosure can readily be modified into other specific forms without changing its technical idea or essential features. Therefore, the embodiments described above should be understood in all respects as illustrative and not restrictive. The scope of the present disclosure is indicated by the claims set out below rather than by the detailed description above, and all changes or modified forms derived from the meaning and scope of the claims and their equivalent concepts should be construed as being included in the scope of the present disclosure.

The mode for carrying out the invention is as described above in the best mode for carrying out the invention.

The present invention has industrial applicability because it can be used as a career work data analysis technology to automatically analyze requested career work data and determine whether the evaluation of the career worker is positive or negative.

Claims

A career work data analysis method performed by a career work data analysis system, the method comprising:

a) collecting career work data and performing preprocessing, including tokenization, stopword removal, and noun extraction, on the collected career work data;

b) classifying the preprocessed career work data into categories; and

c) performing sentiment analysis that classifies the career work data as either positive or negative.
The method according to claim 1,

wherein, in step a), the career work data is collected from at least one of an external server, an external database, and a user terminal communicatively connected to the career work data analysis system, and

wherein the career work data includes review data about job performers.
The method according to claim 1,

wherein, in step b),

the career work data is analyzed based on the LDA (Latent Dirichlet Allocation) technique, and the ability value of each of at least one category is displayed in the form of a graph according to the analysis results.
The method according to claim 1,

wherein step b)

further comprises performing a reclassification process on the career work data classified into the categories, and

wherein the reclassification process calculates the similarity between words or between a word and a category, and classifies the career work data based on the similarity.
The method according to claim 1,

wherein step c) comprises:

c-1) performing a subjectivity detection technique on the career work data to extract only the portion of the career work data in which a person's subjectivity appears; and

c-2) performing a polarity detection technique on the portion of the career work data in which a person's subjectivity appears, and classifying that portion as expressing either positive or negative emotion.
The method according to claim 5,

wherein the subjectivity detection technique is a detection technique that classifies only the elements to be used for sentiment analysis by removing parts of the career work data that are not related to emotion and personal information, and

wherein the polarity detection technique is a detection technique that detects and quantifies positive and negative words in a text and applies weights representing the positive and negative words to determine whether a sentence containing the text is positive or negative.
A career work data analysis system comprising:

a communication module;

at least one processor; and

a memory electrically connected to the processor and storing at least one code executed by the processor,

wherein the memory stores code that, when executed by the processor, causes the processor to

collect career work data, perform preprocessing, including tokenization, stopword removal, and noun extraction, on the collected career work data, classify the preprocessed career work data into categories, and perform sentiment analysis that classifies the career work data as either positive or negative.
The system according to claim 7,

wherein the memory stores code that causes the processor to

collect the career work data from at least one of an external server, an external database, and a user terminal communicatively connected to the career work data analysis system, and

wherein the career work data includes review data about job performers.
The system according to claim 7,

wherein the memory stores code that causes the processor to

analyze the career work data based on the LDA (Latent Dirichlet Allocation) technique and display the ability value of each of at least one category in the form of a graph according to the analysis results.
The system according to claim 7,

wherein the memory stores code that causes the processor to

perform a reclassification process on the career work data classified into the categories, and

wherein the reclassification process calculates the similarity between words or between a word and a category, and classifies the career work data based on the similarity.
The system according to claim 7,

wherein the memory stores code that causes the processor to

perform a subjectivity detection technique on the career work data to extract only the portion of the career work data in which a person's subjectivity appears, perform a polarity detection technique on that portion, and classify the portion in which the person's subjectivity appears as expressing either positive or negative emotion.
The system according to claim 11,

wherein the subjectivity detection technique is a detection technique that classifies only the elements to be used for sentiment analysis by removing parts of the career work data that are not related to emotion and personal information, and

wherein the polarity detection technique is a detection technique that detects and quantifies positive and negative words in a text and applies weights representing the positive and negative words to determine whether a sentence containing the text is positive or negative.