WO2023249345A1 - Method and system for analyzing work experience data - Google Patents

Method and system for analyzing work experience data

Info

Publication number
WO2023249345A1
Authority
WO
WIPO (PCT)
Prior art keywords
work data
career
career work
positive
detection technique
Prior art date
Application number
PCT/KR2023/008436
Other languages
French (fr)
Korean (ko)
Inventor
전혜진
Original Assignee
주식회사 이지태스크
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 이지태스크
Publication of WO2023249345A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • G06Q10/105Human resources
    • G06Q10/1053Employment or hiring
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/3332Query translation
    • G06F16/3335Syntactic pre-processing, e.g. stopword elimination, stemming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/40Data acquisition and logging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management

Definitions

  • the present invention relates to a career work data analysis method and system, and more specifically, to a method and system that collect career work data, classify the categories of the collected career work data according to preset criteria, and perform sentiment analysis on the classified career work data to determine whether the evaluation of the freelancer is positive or negative.
  • the freelance market has a variety of fields, and tasks are subdivided in each field, so even the same work can be interpreted as actually different work.
  • freelance work experience is also not standardized, so a different situation is highly likely to arise for each work request.
  • the requested cost for the same work varies greatly, and the market structure makes it difficult to guarantee trust in the experience and abilities of freelancers.
  • the present disclosure is intended to solve the problems of the prior art described above, by collecting career work data, classifying the categories of the collected career work data according to preset standards, and performing sentiment analysis on the classified career work data.
  • an embodiment according to the first aspect of the present disclosure provides a career work data analysis method.
  • the method includes collecting career work data, performing a preprocessing process including tokenization, stopword removal, and noun extraction on the collected career work data, classifying categories of the preprocessed career work data, and performing sentiment analysis that classifies the career work data as either positive or negative.
  • an embodiment according to the second aspect of the present disclosure provides a career work data analysis system.
  • the system includes a communication module, at least one processor, and a memory electrically connected to the processor and storing at least one code to be executed by the processor. When executed through the processor, the code causes the processor to collect career work data, perform a preprocessing process including tokenization, stopword removal, and noun extraction on the collected career work data, classify categories of the preprocessed career work data, and perform sentiment analysis that classifies the career work data as either positive or negative.
  • later processing and analysis of data can be facilitated by subdividing, classifying, and storing the collected career tasks on a systematic basis.
  • highly reliable career verification can be achieved by determining whether the freelancer's evaluation is positive or negative based on the attributes of the collected review data.
  • a method of collecting and efficiently managing a vast amount of career work data is disclosed, thereby laying the foundation for solving the problem of asymmetry of information related to the freelance market.
  • FIG. 1 is a diagram illustrating a career work data analysis system according to an embodiment of the present invention.
  • FIG. 2 is a diagram showing the detailed configuration of the server shown in FIG. 1.
  • Figure 3 is a diagram illustrating the process of analyzing career work data.
  • Figure 4 is a diagram illustrating an example of classifying categories of career work data using the LDA technique.
  • Figure 5 is a diagram illustrating a career work data analysis system according to another embodiment of the present invention.
  • Figure 6 is a flowchart showing the sequence of a career work data analysis method according to another embodiment of the present invention.
  • terms such as first and second used in this specification are used only for the purpose of distinguishing one component from another component and do not limit the order or relationship of the components.
  • for example, a first component of the present disclosure may be named a second component, and similarly, the second component may also be named a first component.
  • singular forms of expression should be construed to also include plural forms of expression, unless the contrary is clearly indicated.
  • FIG. 1 is a diagram illustrating a career work data analysis system according to an embodiment of the present invention.
  • the career work data analysis system is a device that collects and analyzes career work data, and may be implemented in the form of a server or terminal, for example.
  • the career work data analysis system can be included in a work integrated control system and built as part of optimized online work brokerage and career management.
  • the server 100 collects career work data and performs a pre-processing process including tokenization, stop word removal, and noun extraction on the collected career work data.
  • the server 100 may collect career work data from at least one of a communication-connected external server, an external database, and a user terminal.
  • career work data may include review data.
  • the server 100 classifies the categories of the preprocessed career work data.
  • the category may be composed of words from career work data that have the same subject range.
  • the server 100 performs sentiment analysis on career work data as either positive or negative.
  • the user terminal 200 may transmit career work data to the server 100 and receive a sentiment analysis result from the server 100.
  • the user terminal 200 may be connected to the server 100 through a communication network.
  • the user terminal 200 may be a laptop or desktop computer equipped with a web browser, or a wireless communication device that guarantees portability and mobility, including any type of handheld wireless communication device such as a smartphone or tablet PC.
  • FIG. 2 is a diagram showing the detailed configuration of the server shown in FIG. 1.
  • the server 100 may include a communication module 110, a processor 120, and a memory 130.
  • the communication module 110 may include a device including hardware and software necessary to transmit and receive signals such as control signals or data signals through wired or wireless connections with other network devices.
  • the communication module 110 may receive career work data from at least one of a user terminal, a communication-connected external server, and an external database. Additionally, the communication module 110 may transmit the results of sentiment analysis of career work data to at least one of a user terminal, a communication-connected external server, and an external database.
  • the external database is a device that stores various data generated by a specific job search site on the web or in an application, and can be linked to the career work data analysis system over a network to provide the career work data it stores.
  • the external database may be included in an external server that controls various procedures of the job search site, and is preferably implemented as a cloud server to continuously provide data regardless of space.
  • the processor 120 may include various types of devices that control and process data.
  • the processor 120 may refer to a data processing device built into hardware that has a physically structured circuit to perform functions expressed by codes or instructions included in a program.
  • the processor 120 may be implemented in the form of a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), etc., but the scope of the present invention is not limited thereto.
  • the processor 120 performs operations according to the code stored in the memory 130.
  • the memory 130 may store at least one of information and data input to the communication module 110, information and data required for functions performed by the processor 120, and data generated according to execution of the processor 120.
  • Memory 130 should be interpreted as a general term for non-volatile storage devices that continue to retain stored information even when power is not supplied and volatile storage devices that require power to maintain stored information.
  • the memory 130 may include magnetic storage media or flash storage media in addition to volatile storage devices that require power to maintain stored information, but the scope of the present invention is not limited thereto.
  • the memory 130 is electrically connected to the processor 120 and stores at least one code executed by the processor 120.
  • the memory 130 stores code that, when executed through the processor 120, causes the processor 120 to perform the following functions and procedures.
  • Memory 130 stores code that causes career work data to be collected.
  • career work data may be collected from at least one of a communication-connected external server, an external database, and a user terminal.
  • career work data refers to the totality of data that can prove the career or work history of a freelancer. It is sufficient that the data is distributed online, and whether it is in structured or unstructured form, or of any particular type, does not limit the present invention.
  • examples include, but are not limited to, ID cards, bankbook copies, personal information consent forms, resumes, work confirmations, work settlement statements, expense payment confirmations, and work logs.
  • career work data may include experience (years, hours, period, etc.), workplace, position, task, role, program used, work field, detailed work history, participation rate, collaborator, task name, task performance goal, and purpose of use. It may also include review data, such as chats or reviews, that evaluates the person with the relevant experience.
  • freelancer is not limited to a person who works on a free contract without any affiliation, but should be interpreted to include all individuals and organizations who are hired to perform a specific task, provide labor, and receive compensation.
  • Memory 130 stores code that causes preprocessing, including tokenization, stopword removal, and noun extraction, to be performed on the collected career work data.
  • the memory 130 may store a code that collects work requirement data and micro-task data from the job requests of client companies listed on job search sites, and classifies the collected career work data by occupation, task, and micro task.
  • an occupation is a type of occupation or job.
  • a task is something that is objectively performed repeatedly a considerable number of times or is performed with the intention of continuing repetition.
  • a micro task subdivides a task into smaller units so that the work is performed as piecework; it may cover only part of the task.
  • the job description area shown on a job search site explains the job in detail.
  • when a field is progressively subdivided, as in office work - design work - Photoshop, the data for “office work” can be classified as an occupation group, the data for “design” as a task group, and the data for “Photoshop” as a micro task group. Hereinafter, the smallest unit of work, such as Photoshop, is defined as a “micro task.”
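The three-level classification described above can be illustrated with a minimal Python sketch. It assumes the field arrives as a single string whose levels are separated by " - ", as in the example in the text; the names `WorkPath` and `parse_work_path` are hypothetical and are not taken from the specification.

```python
from typing import NamedTuple


class WorkPath(NamedTuple):
    """One classified record: occupation > task > micro task."""
    occupation: str
    task: str
    micro_task: str


def parse_work_path(field: str, sep: str = " - ") -> WorkPath:
    """Split a field string into its occupation, task, and micro task levels."""
    parts = [part.strip() for part in field.split(sep)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 levels, got {len(parts)}: {field!r}")
    return WorkPath(*parts)


# Example string taken from the text above.
path = parse_work_path("office work - design work - Photoshop")
print(path.occupation, "|", path.task, "|", path.micro_task)
```

Storing each record with explicit levels makes it straightforward to group later review data by micro task, which is the unit the specification treats as the smallest unit of work.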
  • the memory 130 may store a code that causes the preprocessed career work data to be classified into categories.
  • the memory 130 may store a code that causes category classification of career work data using an LDA (Latent Dirichlet Allocation) technique.
  • the career work data analysis system may determine categories (topics) for each pre-stored task, determine into which category the nouns extracted from each career work data through preprocessing should be clustered, and classify each noun into the determined category.
  • each category may be composed only of words from career work data having the same subject range. For example, if the category is “translation,” words such as grammar, expression, and word usage may be included, and if the category is “PPT,” words such as design, layout, and unity may be included.
  • a category produced by the LDA technique may contain duplicate words, or words that are unrelated or meaningless. Accordingly, a code that causes a reclassification process of the career work data classified into categories may be stored in the memory 130. For example, if the word “proportion” is found but no other words related to design are included, the data may be assigned to a new category through a new judgment rather than placed in the design category.
  • the memory 130 may store a code that learns the words of the career work data as 200-dimensional vectors in a skip-gram manner using Word2Vec, a neural word-embedding model, and organizes the vector value of each word into a vector space model. Additionally, a code that calculates the similarity between words, or between a word and a category, based on Equation 1 and causes the data to be reclassified on that basis may be stored in the memory 130. For example, words may be divided into those whose similarity is at or above a certain threshold and those whose similarity is below it.
  • A is a word that is a reference value according to the matrix
  • B is an extracted word
  • the similarity between words A and B can be calculated through Equation 1.
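Equation 1 is not reproduced in this text. The sketch below assumes it is the cosine similarity between two word vectors, which is a common choice for Word2Vec embeddings but is not confirmed by the source. The toy 3-dimensional vectors, the threshold of 0.5, and the function names are all hypothetical; the text itself describes 200-dimensional vectors.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_by_similarity(
    reference: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Separate words at or above the threshold from words below it."""
    keep: List[str] = []
    reclassify: List[str] = []
    for word, vector in candidates.items():
        if cosine_similarity(reference, vector) >= threshold:
            keep.append(word)
        else:
            reclassify.append(word)
    return keep, reclassify


# Hypothetical toy vectors: A is the reference ("design"), B the extracted words.
design = [1.0, 0.2, 0.0]
words = {"layout": [0.9, 0.3, 0.1], "grammar": [0.0, 0.1, 1.0]}
keep, reclassify = split_by_similarity(design, words)
print(keep, reclassify)
```

In this toy data "layout" points in nearly the same direction as the reference and stays in the category, while "grammar" falls below the threshold and would be passed to the reclassification step.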
  • review data is generally sentence- or paragraph-level data, so it has been difficult to determine whether it is positive or negative. For example, even if the review data contains mostly positive feedback, if it ends with “but I will not work with this freelancer,” the review data should be determined as negative, yet conventional analysis would mostly classify it as positive.
  • the memory 130 of the present invention may store a code that determines whether the attribute is positive or negative for each classified career work data. For example, a code that causes the weight given to a positive word to be adjusted according to the customer's satisfaction rating and the degree of future rematching may be stored in the memory 130. For example, even a positive response such as “Yes,” “It’s good,” or “Thank you for your hard work” may be judged negative if the individual rating is low or no rematching takes place, and may be judged positive if the next matching proceeds or the rating is high.
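One way to read the weighting described above is sketched below. The specification does not give a formula, so everything here is an assumption: the rating is taken to lie on a 0 to 5 scale, the rating and the rematch outcome are each mapped to a signal in [-1, 1], and mixed cases (for example, a low rating followed by a rematch) are resolved by averaging the two signals. The function names are hypothetical.

```python
def rematch_weight(rating: float, rematched: bool, max_rating: float = 5.0) -> float:
    """Map the satisfaction rating and rematch outcome to a weight in [-1, 1].

    Assumption: both signals count equally; the source does not specify this.
    """
    rating_signal = 2.0 * (rating / max_rating) - 1.0   # 0 -> -1, max -> +1
    rematch_signal = 1.0 if rematched else -1.0
    return (rating_signal + rematch_signal) / 2.0


def judge_review(lexical_score: float, rating: float, rematched: bool) -> str:
    """Re-weight positive wording by context, then label the review."""
    if lexical_score > 0:
        # Positive words only count as far as rating and rematching support them.
        lexical_score *= rematch_weight(rating, rematched)
    return "positive" if lexical_score > 0 else "negative"


# "Thank you for your hard work" scores positive on wording alone (+1.0 here).
print(judge_review(1.0, rating=1.0, rematched=False))  # low rating, no rematch
print(judge_review(1.0, rating=5.0, rematched=True))   # high rating, rematched
```

With these assumptions the same polite wording is labeled negative in the first call and positive in the second, which matches the two cases named in the text.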
  • the memory 130 may store a code that converts review data into attribute units and analyzes whether each attribute is positive or negative. For example, the memory 130 may store a code that assigns an attribute value to each word and specifies similar words for that word, forming a data set from which the underlying meaning can be understood.
  • noun-level attribute identification can be performed on the entire review data.
  • Attributes classified from specific career work data may include core attributes, peripheral attributes, and communication attributes, and examples of each attribute may be as shown in Table 1.
  • the memory 130 stores a code that causes sentiment analysis to be performed on career work data as either positive or negative.
  • the memory 130 may store a code that causes sentiment analysis to be performed as either positive or negative on review data containing evaluations of experienced employees among the career work data.
  • Code may be stored that performs a subjectivity detection technique to extract only those parts of the career work data in which a person's subjectivity appears.
  • the subjectivity detection technique may be a detection technique that retains only the elements to be used for sentiment analysis by removing parts of career work data that are not related to emotions and personal information.
  • the memory 130 may store a code that performs a polarity detection technique on the part of the career work data in which a person's subjectivity appears, and causes the part in which the person's subjectivity appears to be classified as one of positive and negative emotions.
  • a polarity detection technique may be a detection technique that detects and quantifies positive and negative words in text and determines whether a sentence containing the text is positive or negative by applying weights representing positive and negative words.
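The two-stage flow described above (subjectivity detection followed by polarity detection) can be sketched with a small lexicon-based example. The lexicon, its weights, and the function names are hypothetical, and the sketch makes simplifying assumptions: a sentence counts as subjective if it contains any lexicon word, negation is not handled, personal-information removal is omitted, and a total score of zero is labeled negative because the output must be binary.

```python
import re
from typing import Dict, List

# Hypothetical lexicon: positive weights > 0, negative weights < 0.
LEXICON: Dict[str, float] = {"great": 1.0, "good": 1.0, "late": -1.0, "poor": -1.5}


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def subjective_sentences(text: str, lexicon: Dict[str, float]) -> List[str]:
    """Subjectivity detection: keep only sentences containing an opinion word."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if any(t in lexicon for t in tokenize(s))]


def polarity(sentences: List[str], lexicon: Dict[str, float]) -> str:
    """Polarity detection: sum the word weights and label the result."""
    score = sum(lexicon.get(t, 0.0) for s in sentences for t in tokenize(s))
    return "positive" if score > 0 else "negative"


review = (
    "The layout was great. The file was sent on Monday. "
    "Delivery was late and communication was poor."
)
kept = subjective_sentences(review, LEXICON)
label = polarity(kept, LEXICON)
print(kept)
print(label)
```

The factual middle sentence is dropped by the first stage, and the remaining weights (+1.0, -1.0, -1.5) sum to a negative score.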
  • the memory 130 may store a code that extracts verbs and adjectives from career work data and analyzes emotional words through Equation 2.
  • x is a sentence containing an attribute
  • c may be a class that determines positivity or negativity, and if sentence ‘x’ is positive, it can have a value of +1, and if it is negative, it can have a value of -1.
  • TC-W2V means measuring the relatedness between words in a topic using Word2Vec
  • N means the top k words in the topic
  • K means the total number of topics.
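Equation 2 is not reproduced in this text, so the following is only a sketch of how a TC-W2V score is commonly computed, not the formula of the specification. It assumes the usual definition: for each topic, the mean pairwise cosine similarity between the vectors of its top N words, averaged over all K topics. The toy 2-dimensional vectors and function names are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def topic_coherence(top_words: List[str], vectors: Dict[str, Sequence[float]]) -> float:
    """Mean pairwise similarity of the top N words of one topic."""
    pairs = list(combinations(top_words, 2))
    if not pairs:
        return 0.0
    return sum(cosine(vectors[u], vectors[v]) for u, v in pairs) / len(pairs)


def model_coherence(topics: List[List[str]], vectors: Dict[str, Sequence[float]]) -> float:
    """Average the per-topic coherence over all K topics."""
    return sum(topic_coherence(t, vectors) for t in topics) / len(topics)


# Hypothetical toy embeddings standing in for trained Word2Vec vectors.
vectors = {
    "design": [1.0, 0.0], "layout": [1.0, 0.0],
    "grammar": [0.0, 1.0], "spelling": [0.0, 1.0],
}
coherent = model_coherence([["design", "layout"], ["grammar", "spelling"]], vectors)
mixed = model_coherence([["design", "grammar"], ["layout", "spelling"]], vectors)
print(coherent, mixed)
```

Topics whose top words sit close together in the embedding space score higher than topics that mix unrelated words, which is the property such a measure is used to check.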
  • Figure 3 is a diagram illustrating the process of analyzing career work data.
  • the career work data analysis system may include a data collection unit 310, a data pre-processing unit 320, and an LDA processing unit 330.
  • the data collection unit 310 may collect career work data from a user terminal, an external server connected to communication with the career work data analysis system, and an external database.
  • the data collection unit 310 may collect career work data listed on a job search site. For example, the data collection unit 310 may perform Internet crawling to retrieve career work data distributed externally.
  • the data preprocessing unit 320 may perform a preprocessing process on the collected career work data.
  • the data preprocessing unit 320 may sequentially perform tokenization, stopword removal, and noun extraction on career work data, standardizing it so that the meaning or task that the career work data typically represents can be determined. Table 2 below shows an example of the preprocessing process for specific review data.
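Table 2 is not reproduced in this text. As a stand-in illustration, the three preprocessing stages can be sketched in Python as follows. The stopword list and the noun list are hypothetical; in particular, the noun list only stands in for a real part-of-speech tagger or morphological analyzer, which an actual system would need (especially for Korean text).

```python
import re
from typing import List, Set

# Hypothetical resources for the sketch.
STOPWORDS: Set[str] = {"the", "and", "of", "a", "is", "was", "were"}
NOUNS: Set[str] = {"layout", "color", "design", "grammar", "spelling"}


def tokenize(text: str) -> List[str]:
    """Stage 1: split the text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens: List[str]) -> List[str]:
    """Stage 2: drop words that carry no topical meaning."""
    return [t for t in tokens if t not in STOPWORDS]


def extract_nouns(tokens: List[str]) -> List[str]:
    """Stage 3: keep nouns only (lexicon lookup stands in for a POS tagger)."""
    return [t for t in tokens if t in NOUNS]


def preprocess(text: str) -> List[str]:
    """Run the three stages in the order given in the text."""
    return extract_nouns(remove_stopwords(tokenize(text)))


print(preprocess("The layout and the color of the design were good"))
# ['layout', 'color', 'design']
```

The resulting noun list is the input that the category classification stage would consume.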
  • the LDA processing unit 330 may classify the categories of career work data preprocessed through the data preprocessing unit 320. For example, the LDA processing unit 330 may perform category classification of career work data using the Latent Dirichlet Allocation (LDA) technique. For example, the LDA processing unit 330 may analyze career work data based on the LDA technique and display the capabilities of each of at least one category according to the analysis results in the form of a graph.
  • Figure 4 is a diagram illustrating an example of classifying categories of career work data using the LDA technique.
  • the career work data analysis system can perform categorization of career work data using the LDA (Latent Dirichlet Allocation) technique.
  • the LDA technique is a representative algorithm for topic modeling. By analyzing a large amount of document data through a probability-based modeling technique, it can estimate which topics a document is composed of and in what proportions.
  • the career work data analysis system can predetermine the corresponding topic value for each noun related to work and analyze which of the various categories it fits into by analyzing newly occurring words while work is in progress.
  • each category may consist only of words from career work data that share the same subject scope. For example, if the category is “translation,” it may consist of words such as grammar, expression, word usage, spelling, speed, writing power, and delivery, and if the category is “PPT,” it may consist of words such as design, layout, and unity.
  • the career work data analysis system can set categories based on collected review data or based on preset job names. For example, the career work data analysis system can initially register words such as color, layout, and ratio under the design work category and, based on the values once set, classify incoming data containing the word “layout” into the design category.
  • the abilities of each category such as figma, Photoshop, color, layout, and illustration can be displayed in graph form.
  • the ability values of each category such as expression, spelling, speed, writing ability, and delivery ability can be displayed in the form of a graph.
  • the abilities of each category such as Google Console usage, GA usage, Instagram follower recruitment ability, Facebook usage ability, and writing skills, can be displayed in graph form.
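To make the LDA-based category classification more concrete, the sketch below implements a small collapsed Gibbs sampler for LDA in pure Python. It is an illustration under stated assumptions, not the implementation of the specification: the hyperparameters `alpha` and `beta`, the iteration count, the toy documents, and the function name `lda_gibbs` are all hypothetical, and a production system would more likely rely on an established topic-modeling library.

```python
import random
from collections import defaultdict
from typing import DefaultDict, List, Tuple


def lda_gibbs(
    docs: List[List[str]],
    num_topics: int,
    iterations: int = 200,
    alpha: float = 0.1,
    beta: float = 0.01,
    seed: int = 0,
) -> Tuple[List[List[float]], List[DefaultDict[str, int]]]:
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # topic counts per document
    topic_word: List[DefaultDict[str, int]] = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics                        # tokens assigned per topic
    assignments: List[List[int]] = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed topic proportions for each document.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return theta, topic_word


# Toy noun lists built from the example words in the text.
docs = [
    ["grammar", "expression", "spelling", "grammar"],
    ["design", "layout", "unity", "layout"],
    ["spelling", "expression", "grammar"],
    ["layout", "design", "unity"],
]
theta, topic_word = lda_gibbs(docs, num_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
```

Each row of `theta` gives the estimated proportion of each topic in one document, which corresponds to the "which topics, and in what proportions" output described for the LDA technique. Because sampling is stochastic, the topic indices carry no fixed meaning and results on such a small corpus can vary with the seed.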
  • Figure 5 is a diagram illustrating a career work data analysis system according to another embodiment of the present invention.
  • the career work data analysis system may include a matching management system and a career management system.
  • the matching management system can handle the overall process for online business matching services. In particular, it is possible to check the mutual needs of the client and performer in real time, perform matching based on satisfaction and feedback, and continuously secure career data for micro tasks.
  • the matching management system can manage at least one of customer management, task management, information data management, competency management, administrator management, and purchase/cost management.
  • the career management system comprises a career work data analysis system according to an embodiment of the present invention, and can continuously secure additional data related to career work by connecting to various external data sources, including external databases. Accordingly, standardized criteria for freelancers' experience, history, and pay rates can be presented.
  • the career management system can manage at least one of the tasks and career management of each worker and performer, information data management, career analysis management, document and certification management, micro tasks, and freelance career management.
  • task matching is carried out between customers and task performers, and feedback can occur according to the tasks performed.
  • Customer satisfaction and feedback generated here can be used as information data under the career management system to analyze careers (competencies) and provide them to customers.
  • Figure 6 is a flowchart showing the sequence of a career work data analysis method according to another embodiment of the present invention.
  • the career work data analysis method to be described below may be performed by the career work data analysis system and server previously described with reference to FIGS. 1 to 5. Accordingly, the contents of the embodiments of the present disclosure previously described with reference to FIGS. 1 to 5 can be equally applied to the embodiments to be described below, and contents that overlap with the description described above will be omitted below.
  • the steps described below do not necessarily have to be performed in order, the order of the steps may be set in various ways, and the steps may be performed almost simultaneously.
  • the career work data analysis method includes a preprocessing step of career work data (S100), a categorization step of career work data (S200), and a sentiment analysis step of career work data (S300).
  • the preprocessing step (S100) of career work data is a step in which career work data is collected and a preprocessing process including tokenization, stopword removal, and noun extraction is performed on the collected career work data.
  • career work data including review data including an evaluation of a career employee may be collected from at least one of a communication-connected external server, an external database, and a user terminal.
  • the category classification step (S200) of career work data is a step of classifying the categories of preprocessed career work data.
  • the career work data may be analyzed based on the LDA technique, and the career work data categories may be classified according to the analysis results. Categories may be composed of words from career work data that have the same subject range.
  • the sentiment analysis step (S300) of career work data is a step of performing sentiment analysis on career work data into either positive or negative.
  • a subjectivity detection technique is performed on the career work data to extract only the portions in which a person's subjectivity appears, and a polarity detection technique is then performed on those portions to classify them as one of positive and negative emotions.
  • the subjectivity detection technique is a detection technique that retains only the elements to be used for sentiment analysis by removing parts of career work data that are not related to emotions and personal information. The polarity detection technique may be a detection technique that detects and quantifies positive and negative words in the text and indicates the state of a sentence by applying statistical techniques to weight whether each word is positive or negative.
  • the present invention has industrial applicability because it can be used as a career work data analysis technology to automatically analyze requested career work data and determine whether the evaluation of the career worker is positive or negative.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

One embodiment of the present invention provides a method for analyzing work experience data. The method comprises the steps of: collecting work experience data and pre-processing the collected work experience data, the pre-processing including tokenization, stop word removal, and noun extraction; categorizing the pre-processed work experience data; and performing sentiment analysis that classifies the work experience data as either positive or negative.

Description

Career work data analysis method and system
The present invention relates to a method and system for analyzing career work data and, more specifically, to a method and system that collects career work data, classifies the collected career work data into categories according to preset criteria, and performs sentiment analysis on the classified career work data to determine whether the evaluation of the corresponding freelancer is positive or negative.
Recently, with changing social conditions such as the pandemic and the introduction of flexible working hours, increasingly flexible forms of work life have come into the spotlight.
In addition, job creation remains a major social issue. Companies continue to hire regular employees for internal and external reasons, but the substantial fixed and variable expenses incurred upon hiring make this burden difficult to resolve.
Under these circumstances, companies are reducing the number of regular employees and increasingly engaging freelancers, outsourced workers, and contract workers. The market for professional freelancers in particular is growing significantly.
However, recruiting and job seeking for freelancers still involve cumbersome procedures that consume considerable time and money. For example, even for relatively simple odd jobs, job postings, resume reviews, evaluations, and interviews are carried out.
To overcome this situation, online platforms that connect companies with freelancers are appearing on the market.
However, conventional online platforms have a structural problem in that they intensify price competition among freelancers.
Specifically, existing platforms have a service structure in which freelancers propose their own prices, which leads to excessive price competition relative to the time spent on the work. This continued competition ultimately drives unit prices down and can create a vicious cycle in which exposure frequency decreases and opportunities to obtain work become limited.
In addition, the freelance market is inherently prone to information asymmetry, and conventional online platform technology has not solved this problem.
For example, the freelance market spans diverse fields, and the work within each field is finely subdivided, so even nominally identical work is often interpreted as different work in practice. As a result, freelance work experience is not described uniformly, and each work request is likely to play out differently. In other words, the cost quoted for the same work varies widely, and the structure makes it difficult to guarantee trust in a freelancer's experience and abilities.
Accordingly, research is needed on a method that can efficiently manage freelancers' career work data and analyze a freelancer's career to determine whether the evaluation of that freelancer is positive or negative.
The present disclosure is intended to solve the problems of the prior art described above, and aims to provide a method and system that collects career work data, classifies the collected career work data into categories according to preset criteria, and performs sentiment analysis on the classified career work data to determine whether the evaluation of the corresponding freelancer is positive or negative.
The technical problems to be solved by the present invention are not limited to those described above, and other technical problems may be derived from the following description.
As a technical means for solving the above technical problem, an embodiment according to a first aspect of the present disclosure provides a career work data analysis method. The method includes: collecting career work data and performing a preprocessing process including tokenization, stop word removal, and noun extraction on the collected career work data; classifying the preprocessed career work data into categories; and performing sentiment analysis that classifies the career work data as either positive or negative.
In addition, an embodiment according to a second aspect of the present disclosure provides a career work data analysis system. The system includes a communication module, at least one processor, and a memory that is electrically connected to the processor and stores at least one code executed by the processor. The memory stores code that, when executed by the processor, causes the processor to: collect career work data; perform a preprocessing process including tokenization, stop word removal, and noun extraction on the collected career work data; classify the preprocessed career work data into categories; and perform sentiment analysis that classifies the career work data as either positive or negative.
According to the present invention, extensive data on freelancers' career work can be built up by collecting data related to careers and work history from external servers.
In addition, according to the present invention, the collected career work is subdivided, classified, and stored according to systematic criteria, which facilitates later processing and analysis of the data.
In addition, according to the present invention, review data about freelancers is collected from external servers and can be used as material for assessing a freelancer's career.
In addition, according to the present invention, determining whether the evaluation of a freelancer is positive or negative based on the attributes of the collected review data enables highly reliable career verification.
In addition, the present invention discloses a method for collecting a vast amount of career work data and managing it efficiently, thereby laying the groundwork for resolving the information asymmetry of the freelance market.
The effects of the present invention are not limited to those described above and include all effects that can be understood from the following description.
FIG. 1 is a diagram illustrating a career work data analysis system according to an embodiment of the present invention.
FIG. 2 is a diagram showing the detailed configuration of the server shown in FIG. 1.
FIG. 3 is a diagram illustrating the process of analyzing career work data.
FIG. 4 is a diagram illustrating an example of classifying career work data into categories using the LDA technique.
FIG. 5 is a diagram illustrating a career work data analysis system according to another embodiment of the present invention.
FIG. 6 is a flowchart showing the sequence of a career work data analysis method according to another embodiment of the present invention.
Hereinafter, the present disclosure will be described in detail with reference to the attached drawings. However, the present disclosure may be implemented in various different forms and is not limited to the embodiments described herein. In addition, the attached drawings are intended only to facilitate understanding of the embodiments disclosed in this specification, and the technical idea disclosed in this specification is not limited by the attached drawings. All terms used herein, including technical and scientific terms, should be interpreted as having the meanings commonly understood by those of ordinary skill in the art to which the present disclosure pertains. Terms defined in commonly used dictionaries should be interpreted as additionally having meanings consistent with the related technical literature and the present disclosure and, unless otherwise defined, should not be interpreted in an overly idealized or restrictive sense.
To describe the present disclosure clearly, parts unrelated to the description are omitted from the drawings, and the size, form, and shape of each component shown in the drawings may be modified in various ways. Throughout the specification, identical or similar parts are given identical or similar reference numerals.
The suffixes "module" and "unit" attached to components in the following description are assigned or used interchangeably solely for ease of drafting the specification, and do not in themselves have distinct meanings or roles. In addition, in describing the embodiments disclosed in this specification, detailed descriptions of related known technologies are omitted where they were judged likely to obscure the gist of the embodiments.
Throughout the specification, when a part is said to be "connected (coupled, in contact, or combined)" with another part, this includes not only the case where it is "directly connected (coupled, in contact, or combined)" but also the case where it is "indirectly connected (coupled, in contact, or combined)" with another member interposed between them. In addition, when a part is said to "include (have or be provided with)" a component, this means that, unless specifically stated otherwise, it may further "include (have or be provided with)" other components rather than excluding them.
Ordinal terms such as "first" and "second" used in this specification serve only to distinguish one component from another and do not limit the order of or relationship between the components. For example, a first component of the present disclosure may be named a second component, and similarly, a second component may be named a first component. Singular forms used in this specification should be construed to include plural forms as well, unless the context clearly indicates otherwise.
FIG. 1 is a diagram illustrating a career work data analysis system according to an embodiment of the present invention.
Referring to FIG. 1, the career work data analysis system is a device that collects and analyzes career work data and may be implemented, for example, in the form of a server or a terminal. According to an embodiment of the present invention, the career work data analysis system may be included in an integrated work control system and built as part of optimized online work brokerage and career management.
The server 100 collects career work data and performs a preprocessing process including tokenization, stop word removal, and noun extraction on the collected career work data. For example, the server 100 may collect career work data from at least one of a communicatively connected external server, an external database, and a user terminal. Here, the career work data may include review data.
The server 100 classifies the preprocessed career work data into categories. Here, a category may consist of words from the career work data that share the same subject range.
The server 100 performs sentiment analysis that classifies the career work data as either positive or negative.
The user terminal 200 may transmit career work data to the server 100 and receive the sentiment analysis result from the server 100.
The user terminal 200 may be communicatively connected to the server 100 through a communication network. The user terminal 200 may be a notebook, desktop, or laptop computer equipped with a web browser, a wireless communication device that ensures portability and mobility, or any kind of handheld wireless communication device such as a smartphone or tablet PC.
FIG. 2 is a diagram showing the detailed configuration of the server shown in FIG. 1.
Referring to FIG. 2, the server 100 may include a communication module 110, a processor 120, and a memory 130.
The communication module 110 may include a device comprising the hardware and software necessary to transmit and receive signals, such as control signals or data signals, over wired or wireless connections with other network devices.
The communication module 110 may receive career work data from at least one of a user terminal, a communicatively connected external server, and an external database. In addition, the communication module 110 may transmit the sentiment analysis result for the career work data to at least one of a user terminal, a communicatively connected external server, and an external database. Here, the external database is a device that stores various data generated by a particular job search site on the web or in an application, and it may be linked to the career work data analysis system over a network to provide the career work data stored in it.
According to one embodiment, the external database may be included in an external server that controls the various procedures of the job search site, and it is preferably implemented as a cloud server so that it provides data continuously regardless of location.
The processor 120 may include various types of devices that control and process data. The processor 120 may refer to a data processing device embedded in hardware that has physically structured circuitry for performing functions expressed as code or instructions contained in a program.
In one example, the processor 120 may be implemented as a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or the like, but the scope of the present invention is not limited thereto.
The processor 120 performs operations according to the code stored in the memory 130.
The memory 130 may store at least one of: information and data input to the communication module 110; information and data required for the functions performed by the processor 120; and data generated by the execution of the processor 120.
The memory 130 should be interpreted as collectively referring to non-volatile storage devices, which retain stored information even when power is not supplied, and volatile storage devices, which require power to retain stored information. In addition to volatile storage devices, the memory 130 may include magnetic storage media or flash storage media, but the scope of the present invention is not limited thereto.
The memory 130 is electrically connected to the processor 120 and stores at least one code executed by the processor 120. The memory 130 stores code that, when executed by the processor 120, causes the processor 120 to perform the following functions and procedures.
The memory 130 stores code that causes career work data to be collected. For example, the career work data may be collected from at least one of a communicatively connected external server, an external database, and a user terminal.
Here, career work data refers to the entirety of the various data that can prove a freelancer's career or work history. It is sufficient that the data be available online, and neither its form (structured or unstructured) nor its type limits the present invention. Examples of document data include, but are not limited to, ID cards, bankbook copies, personal information consent forms, resumes, employment confirmations, work settlement statements, payment confirmations, and work logs.
In the present invention, career work data may include experience (years, hours, period, and the like), workplace, position, task, role, programs used, work field, detailed work history, participation rate, collaborators, task name, task performance goal, and purpose of use, as well as review data evaluating the experienced worker, such as chats or reviews.
In the present invention, "freelancer" is not limited to a person who works under free contracts without a fixed affiliation, but should be interpreted to include all individuals and organizations that are engaged to perform a specific task, provide labor, and receive compensation.
The memory 130 stores code that causes a preprocessing process including tokenization, stop word removal, and noun extraction to be performed on the collected career work data.
For example, based on the code stored in the memory 130, the sentence "디자인이 모두 너무 예쁘게 나왔어요..." ("The designs all came out so pretty...") may be tokenized into "디자인" (design), "이" (subject particle), "모두" (all), "너무" (so), "예쁘게" (prettily), "나왔어요" (came out), and "...". Based on the code stored in the memory 130, the stop words are then removed from these tokens so that only "디자인", "너무", and "예쁘게" remain. Finally, based on the code stored in the memory 130, only the noun "디자인" is extracted from "디자인", "너무", and "예쁘게".
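As an illustration only, the three preprocessing steps in the example above might be sketched as follows. The particle list, stop-word list, and noun lexicon are hypothetical toy resources chosen to reproduce the example sentence; the specification does not name a tokenizer, and a practical system would more likely rely on a Korean morphological analyzer.

```python
import re

# Hypothetical toy resources (assumptions, not taken from the specification).
PARTICLES = ("이", "가", "은", "는", "을", "를")   # assumed particle list
STOP_WORDS = {"이", "모두", "나왔어요", "..."}      # assumed stop-word list
NOUNS = {"디자인"}                                  # assumed noun lexicon


def tokenize(sentence: str) -> list[str]:
    """Split on whitespace, then peel off trailing punctuation and one particle."""
    tokens = []
    for chunk in sentence.split():
        match = re.match(r"^(.*?)(\.{3}|[.!?])?$", chunk)
        word, punct = match.group(1), match.group(2)
        particle = next(
            (p for p in PARTICLES if word.endswith(p) and len(word) > len(p)),
            None,
        )
        if particle:
            tokens += [word[: -len(particle)], particle]
        elif word:
            tokens.append(word)
        if punct:
            tokens.append(punct)
    return tokens


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop every token found in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def extract_nouns(tokens: list[str]) -> list[str]:
    """Keep only tokens found in the noun lexicon."""
    return [t for t in tokens if t in NOUNS]
```

Applied in order to the example sentence, these functions yield the seven tokens, then the three remaining tokens, then the single noun described above. The naive particle stripping would mis-split many real words, which is why the lexicon-based approach is only a sketch.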
The memory 130 may store code that causes work requirement data and micro-task data to be collected from the work request forms that client companies post on job search sites, and that causes the collected career work data to be classified by occupation, task, and micro task. For example, an occupation is a type of job or duty; a task is something objectively performed repeatedly a considerable number of times, or performed with the intention of continuing to repeat it; and a micro task is a part of a task carried out as piecework after the task has been subdivided into smaller units.
Here, the job description area shown on a job search site describes the work in detailed units. For example, when the field is shown progressively subdivided, as in "office work - design work - Photoshop", the "office work" data may be classified into the occupation group, the "design" data into the task group, and the "Photoshop" data into the micro-task group. Hereinafter, the finest unit of work, such as Photoshop, is defined as a "micro task".
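The three-level split described above can be sketched roughly as follows. The `WorkPath` type, the function names, and the assumption that a job description arrives as a hyphen-separated string with exactly three levels are all hypothetical and serve only to illustrate the grouping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkPath:
    occupation: str   # coarsest level, e.g. "office work"
    task: str         # middle level, e.g. "design work"
    micro_task: str   # finest level, e.g. "Photoshop"


def parse_work_path(description: str, sep: str = "-") -> WorkPath:
    """Split an 'occupation-task-micro task' string into its three levels."""
    parts = [part.strip() for part in description.split(sep)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 levels, got {len(parts)}: {description!r}")
    return WorkPath(*parts)


def group_by_level(paths: list[WorkPath]) -> dict[str, set[str]]:
    """Collect the distinct values seen at each level."""
    groups = {"occupation": set(), "task": set(), "micro_task": set()}
    for path in paths:
        groups["occupation"].add(path.occupation)
        groups["task"].add(path.task)
        groups["micro_task"].add(path.micro_task)
    return groups
```

For instance, `parse_work_path("office work - design work - Photoshop")` places "office work" in the occupation group, "design work" in the task group, and "Photoshop" in the micro-task group.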
The memory 130 may store code that causes the preprocessed career work data to be classified into categories. For example, the memory 130 may store code that causes the category classification of the career work data to be performed using the latent Dirichlet allocation (LDA) technique.
Specifically, the career work data analysis system sets a category (topic) for each pre-stored task, determines into which category the nouns extracted from each item of career work data through preprocessing should be clustered, and classifies each noun into the determined category.
Accordingly, each category may consist only of words from the career work data that share the same subject range. For example, the category "translation" may contain words such as grammar, expressiveness, and word usage, and the category "PPT" may contain words such as design, layout, and unity.
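The outcome of this step, nouns grouped under task categories, can be illustrated with a deliberately simple sketch. Note that the code below is not LDA: it is a plain keyword lookup standing in for the topic model, and the category word sets (taken from the example above) and the "unclassified" bucket are assumptions made for illustration.

```python
from collections import defaultdict

# Category word sets borrowed from the example in the text; in the described
# system these groupings would come from the LDA topic model instead.
CATEGORY_WORDS = {
    "translation": {"grammar", "expressiveness", "word usage"},
    "PPT": {"design", "layout", "unity"},
}


def assign_categories(nouns: list[str]) -> dict[str, list[str]]:
    """Place each noun in the first category whose word set contains it."""
    assigned = defaultdict(list)
    for noun in nouns:
        for category, words in CATEGORY_WORDS.items():
            if noun in words:
                assigned[category].append(noun)
                break
        else:
            # Hypothetical bucket for nouns that match no category.
            assigned["unclassified"].append(noun)
    return dict(assigned)
```

A real topic model would infer the word sets from the corpus rather than take them as input; the sketch only shows the shape of the resulting classification.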
With the LDA technique, a category may end up containing duplicate words or words that are unrelated or meaningless. Accordingly, the memory 130 may store code that causes a reclassification process to be performed on the career work data that has been classified into categories. For example, if the word "ratio" is identified but no design-related words appear alongside it, the word may be assigned to a new category through a fresh determination rather than placed in the design category.
The memory 130 may store code that causes the words of the career work data to be trained with Word2Vec, a deep learning model, using a 200-dimensional skip-gram method, and the vector value of each word to be built into a vector space model. In addition, the memory 130 may store code that causes the similarity between words, or between a word and a category, to be calculated based on Equation 1 and the data to be reclassified on that basis. For example, words whose similarity is at or above a certain threshold may be separated from words whose similarity is below that threshold.
[Equation 1] Figure PCTKR2023008436-appb-img-000001
Here, A is the word that serves as the reference value according to the matrix, B is the extracted word, and Equation 1 gives the similarity between words A and B.
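The formula itself survives here only as an image reference, so its exact form is not reproduced. The sketch below assumes that the similarity is cosine similarity (the dot product of the two word vectors divided by the product of their norms), which is a common choice for comparing Word2Vec vectors but is an assumption rather than something the text confirms. The 0.5 threshold and the three-dimensional toy vectors are likewise hypothetical; the text describes 200-dimensional vectors.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between vectors a and b (assumed form of the similarity)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def split_by_similarity(
    reference: list[float],
    candidates: dict[str, list[float]],
    threshold: float = 0.5,  # hypothetical cut-off
) -> tuple[list[str], list[str]]:
    """Separate words at or above the threshold from words below it."""
    similar, dissimilar = [], []
    for word, vector in candidates.items():
        if cosine_similarity(reference, vector) >= threshold:
            similar.append(word)
        else:
            dissimilar.append(word)
    return similar, dissimilar
```

With a reference vector for "design", a nearby vector for "layout" would land in the first list and an orthogonal vector for "ratio" in the second, mirroring the reclassification example given earlier.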
Because review data generally consists of sentences or paragraphs, it has conventionally been difficult to judge as positive or negative. For example, even if a review contains mostly positive feedback, when it ends with "but I will not work with this freelancer", the review should be judged negative; in practice, however, such reviews were mostly classified as positive.
To solve this problem, the memory 130 of the present invention may store code that causes the attribute of each classified item of career work data to be judged as positive or negative. For example, the memory 130 may store code that causes the weight given to the effectiveness of positive words to be adjusted according to the customer's satisfaction star rating and the degree of subsequent rematching. For example, even for positive responses such as "Yes.", "That's good.", or "Thank you for your hard work.", the review is judged negative if the individual star rating is low or no rematch occurs, and is judged positive if a subsequent match proceeds or the star rating is high.
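One possible reading of this weighting is sketched below. The multiplicative form, the 0.5 penalty for no rematch, the five-star scale, and the 0.6 decision threshold are all hypothetical; the text states only that the weight of positive words is adjusted by star rating and rematching, not how.

```python
def adjusted_positive_weight(
    base_weight: float,
    star_rating: float,
    rematched: bool,
    max_stars: float = 5.0,  # assumed rating scale
) -> float:
    """Scale the weight of positive wording by rating and rematch outcome."""
    rating_factor = star_rating / max_stars
    rematch_factor = 1.0 if rematched else 0.5  # assumed penalty for no rematch
    return base_weight * rating_factor * rematch_factor


def judge_review(
    base_weight: float,
    star_rating: float,
    rematched: bool,
    threshold: float = 0.6,  # assumed decision threshold
) -> str:
    """Return 'positive' or 'negative' after the weight adjustment."""
    weight = adjusted_positive_weight(base_weight, star_rating, rematched)
    return "positive" if weight >= threshold else "negative"
```

Under these assumptions, a politely worded review with five stars and a rematch stays positive, while the same wording with two stars, or with five stars but no rematch, is judged negative.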
The memory 130 may store code that causes the review data to be converted into attribute units and the strengths and weaknesses of each attribute to be analyzed. For example, the memory 130 may store code that causes an attribute value to be assigned to each word and synonyms of the word to be designated with it, so as to construct a dataset from which attributes can be identified.
As an example, because attributes keep accumulating as review data is collected, only the meaningful attributes may be extracted through preset statistical analysis. Preferably, attribute identification is performed at the noun level over the entire review data.
The attributes classified from a particular item of career work data may include core attributes, peripheral attributes, and communication attributes, and examples of each are given in Table 1.
Table 1
Category | Content
Core attributes | Output, cost, time
Peripheral attributes | Micro tasks
Communication attributes | Communication
The memory 130 stores code that causes sentiment analysis to be performed on the career work data, classifying it as either positive or negative. For example, the memory 130 may store code that performs this positive-or-negative sentiment analysis on the review data, within the career work data, that contains evaluations of experienced workers.
The memory 130 may store code that performs a subjectivity detection technique on the career work data so as to extract only the portions in which a person's subjective opinion appears. For example, the subjectivity detection technique may be a detection technique that removes personal information and the portions of the career work data unrelated to sentiment, leaving only the elements to be used for sentiment analysis.
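A minimal sketch of such a subjectivity filter is shown below, assuming that personal information means e-mail addresses and hyphenated phone numbers and that subjective sentences are those containing a word from a small cue list. The regular expressions, the cue words, and the function names are all illustrative assumptions.

```python
import re

# Assumed patterns for personal information (e-mail and hyphenated phone numbers).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{2,3}-\d{3,4}-\d{4}\b")

# Hypothetical cue words that signal a subjective (opinion-bearing) sentence.
SUBJECTIVE_CUES = {"great", "good", "bad", "slow", "pretty", "poor",
                   "satisfied", "disappointed"}

def strip_personal_info(text):
    """Remove e-mail addresses and phone numbers from the text."""
    return PHONE.sub("", EMAIL.sub("", text))

def subjective_sentences(text):
    """Keep only sentences that contain at least one subjective cue word."""
    cleaned = strip_personal_info(text)
    sentences = [s.strip() for s in re.split(r"[.!?]+", cleaned) if s.strip()]
    return [s for s in sentences
            if SUBJECTIVE_CUES & set(re.findall(r"[a-z']+", s.lower()))]
```

Only the sentences returned by `subjective_sentences` would then be passed on to polarity detection.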
The memory 130 may store code that performs a polarity detection technique on the portions of the career work data in which a person's subjective opinion appears, classifying each such portion as either positive or negative. For example, the polarity detection technique may detect and quantify the positive and negative words in a text and apply weights representing positivity and negativity to determine whether the sentence containing that text is positive or negative.
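The weighted word-counting idea can be illustrated with a small lexicon-based scorer. The word weights below are invented for illustration, and treating a zero score as negative is an assumption made only because the classification is binary.

```python
import re

# Hypothetical polarity lexicon: positive weights for positive words, negative for negative.
POLARITY_WEIGHTS = {
    "pretty": 1.0, "great": 1.5, "fast": 1.0,
    "slow": -1.0, "poor": -1.5, "late": -1.0,
}

def polarity_score(sentence):
    """Sum the weights of all lexicon words found in the sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum(POLARITY_WEIGHTS.get(t, 0.0) for t in tokens)

def polarity(sentence):
    """Return +1 for a positive sentence and -1 otherwise (ties count as negative here)."""
    return 1 if polarity_score(sentence) > 0 else -1
```

Under these weights, "The design came out so pretty" is classified as +1 and "Delivery was slow and the quality was poor" as -1.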
The memory 130 may store code that extracts verbs and adjectives from the career work data and analyzes sentiment words using Equation 2.
[Equation 2: Figures PCTKR2023008436-appb-img-000002 to PCTKR2023008436-appb-img-000005]
Here, x is a sentence containing an attribute and c is the class indicating positive or negative polarity; c may take the value +1 if sentence x is positive and -1 if it is negative.
TC-W2V measures the relatedness between words within a topic using Word2Vec, N denotes the number of top-ranked words in a topic, and K denotes the total number of topics. The term shown in Figure PCTKR2023008436-appb-img-000006 may denote the similarity value, computed from Word2Vec, between the two words shown in Figure PCTKR2023008436-appb-img-000007 and Figure PCTKR2023008436-appb-img-000008.
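Because the equation images are not reproduced in this text, the exact form used in the application cannot be confirmed. As a hedged illustration, a commonly used definition of Word2Vec-based topic coherence averages the pairwise similarity of the top N words of a topic and then averages that value over the K topics. The sketch below follows that definition with cosine similarity and toy two-dimensional vectors standing in for real Word2Vec embeddings; all names and vectors are assumptions.

```python
from itertools import combinations
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def topic_coherence(top_words, vectors):
    """Mean pairwise similarity among the top N words of a single topic."""
    pairs = list(combinations(top_words, 2))
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

def mean_coherence(topics, vectors):
    """Average the per-topic coherence over all K topics."""
    return sum(topic_coherence(t, vectors) for t in topics) / len(topics)

# Toy embeddings (assumed); a real system would load trained Word2Vec vectors.
TOY_VECTORS = {
    "design": (1.0, 0.0), "layout": (1.0, 0.0),
    "grammar": (0.0, 1.0), "spelling": (0.0, 1.0),
}
```

A topic whose top words point in the same direction, such as `["design", "layout"]` here, scores close to 1, while a topic mixing unrelated words scores close to 0.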
FIG. 3 is a diagram illustrating the process of analyzing career work data.
Referring to FIG. 3, the career work data analysis system may include a data collection unit 310, a data preprocessing unit 320, and an LDA processing unit 330.
The data collection unit 310 may collect career work data from a user terminal and from external servers and external databases that are communicatively connected to the career work data analysis system.
The data collection unit 310 may collect career work data posted on job search sites. For example, the data collection unit 310 may perform Internet crawling to pull in career work data that is distributed across external sources.
Specifically, the data collection unit 310 searches for URLs related to job search services distributed across the network, finds further hyperlinks within each retrieved URL, and repeats the work of classifying and storing them. In this way, it traverses the web pages of job search sites on the Internet, locates career work data in each external database, builds an index recording where each item of data is located, and then stores the data in a database within the career work data analysis system.
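The repeated search-and-follow loop can be sketched as a breadth-first traversal. To keep the example self-contained and offline, link extraction is passed in as a function and demonstrated on an in-memory link graph; the `crawl` function, the `FAKE_SITE` data, and the `jobs.example` URLs are all hypothetical.

```python
from collections import deque

def crawl(seed_urls, fetch_links, max_pages=100):
    """Breadth-first traversal that indexes each URL in order of discovery.

    fetch_links(url) must return the hyperlinks found on that page; a real
    crawler would download and parse the page, which is omitted here.
    """
    index = {}                      # url -> position in the index
    queue = deque(seed_urls)
    while queue and len(index) < max_pages:
        url = queue.popleft()
        if url in index:            # already visited
            continue
        index[url] = len(index)
        for link in fetch_links(url):
            if link not in index:
                queue.append(link)
    return index

# Hypothetical in-memory link graph standing in for job search pages.
FAKE_SITE = {
    "https://jobs.example/": ["https://jobs.example/a", "https://jobs.example/b"],
    "https://jobs.example/a": ["https://jobs.example/b"],
    "https://jobs.example/b": ["https://jobs.example/"],
}

site_index = crawl(["https://jobs.example/"], lambda url: FAKE_SITE.get(url, []))
```

The resulting `site_index` records where each page was found, which corresponds to the index-building step described above.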
The data preprocessing unit 320 may perform preprocessing on the collected career work data.
For example, the data preprocessing unit 320 may sequentially perform tokenization, stopword removal, and noun extraction on the career work data, thereby normalizing what meaning, or what task, that career work data representatively stands for. Table 2 below shows an example of the preprocessing applied to a particular piece of review data.
Table 2
Data | Review
Original | 디자인이 모두 너무 예쁘게 나왔어요... ("All of the designs came out so pretty...")
Tokenization | 디자인, 이, 모두, 너무, 예쁘게, 나왔어요, ... (design, [subject particle], all, so, pretty, came out, ...)
Stopword removal | 디자인, 너무, 예쁘게 (design, so, pretty)
Noun extraction | 디자인 (design)
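The three preprocessing stages in Table 2 can be sketched on the English gloss of the same review. The stopword list and the noun list are hypothetical stand-ins; a real system handling Korean text would rely on a morphological analyzer for tokenization and part-of-speech tagging.

```python
import re

# Hypothetical stopword list chosen so the example mirrors Table 2.
STOPWORDS = {"the", "all", "of", "out", "came"}
# Hypothetical noun list; a real system would use a part-of-speech tagger instead.
NOUNS = {"design", "designs", "layout"}

def tokenize(text):
    """Split lower-cased text into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stopwords(tokens):
    """Drop tokens that carry little meaning on their own."""
    return [t for t in tokens if t not in STOPWORDS]

def extract_nouns(tokens):
    """Keep only the tokens recognized as nouns."""
    return [t for t in tokens if t in NOUNS]

def preprocess(text):
    """Run tokenization, stopword removal, and noun extraction in sequence."""
    return extract_nouns(remove_stopwords(tokenize(text)))
```

Applied to "All of the designs came out so pretty...", the stages leave `designs`, `so`, `pretty` after stopword removal and `designs` after noun extraction.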
The LDA processing unit 330 may classify the category of the career work data preprocessed by the data preprocessing unit 320. For example, the LDA processing unit 330 may classify the career work data into categories using the LDA (Latent Dirichlet Allocation) technique. For example, the LDA processing unit 330 may analyze the career work data based on the LDA technique and present, in graph form, the ability value of each of one or more categories according to the analysis result.
FIG. 4 is a diagram illustrating an example of classifying career work data into categories using the LDA technique.
Referring to FIG. 4, the career work data analysis system may classify career work data into categories using the LDA (Latent Dirichlet Allocation) technique. LDA is the most representative topic modeling algorithm: by analyzing a large volume of document data with a probabilistic model, it can determine which topics make up each document and in what proportions.
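For reference, the standard generative model behind LDA can be written as below. The symbols are the conventional ones from the topic modeling literature rather than notation used in this application: \(\alpha\) and \(\beta\) are Dirichlet hyperparameters, \(\theta_d\) is the topic proportion vector of document \(d\), and \(\varphi_k\) is the word distribution of topic \(k\).

```latex
\begin{align*}
\theta_d &\sim \mathrm{Dirichlet}(\alpha) && \text{topic proportions of document } d \\
\varphi_k &\sim \mathrm{Dirichlet}(\beta) && \text{word distribution of topic } k \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d) && \text{topic of the } n\text{-th word in } d \\
w_{d,n} \mid z_{d,n}, \varphi &\sim \mathrm{Categorical}(\varphi_{z_{d,n}}) && \text{observed word}
\end{align*}
```

Inferring \(\theta_d\) from the observed words is what yields the per-document topic proportions mentioned above.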
Specifically, the career work data analysis system may assign a topic value in advance to each work-related noun and then analyze words that newly appear while work is in progress to determine which of the various categories they fit.
Each category may consist only of words from the career work data that share the same subject scope. For example, the "translation" category may consist of words such as grammar, expressiveness, word usage, spelling, speed, writing ability, and delivery, and the "PPT" category may consist of words such as design, layout, and consistency.
The career work data analysis system may set categories based on the collected review data or based on preset task names. For example, the system may initially register words such as color, layout, and ratio under the design-work category and, based on those registered values, classify incoming data containing the word "layout" into the design category.
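The preset-keyword classification in this example can be sketched as a lookup that counts keyword hits per category. The keyword sets reuse words from the examples above; the function name and the rule of picking the category with the most hits are assumptions.

```python
# Keywords registered in advance for each category (taken from the examples above).
CATEGORY_KEYWORDS = {
    "design": {"color", "layout", "ratio"},
    "translation": {"grammar", "spelling", "expression"},
}

def classify(tokens, category_keywords=CATEGORY_KEYWORDS):
    """Return the category whose keywords occur most often, or None if none match."""
    counts = {
        category: sum(token in keywords for token in tokens)
        for category, keywords in category_keywords.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None
```

Incoming data containing the token `layout` is therefore classified into the `design` category.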
By analyzing a designer's career work data, the ability value of each category, such as Figma, Photoshop, color, layout, and illustration, can be presented in graph form. On the same principle, by analyzing a translator's career work data, the ability value of each category, such as expressiveness, spelling, speed, writing ability, and delivery, can be presented in graph form. By analyzing a marketer's career work data, the ability value of each category, such as Google Console usage, GA usage, Instagram follower recruitment, Facebook usage, and writing ability, can be presented in graph form.
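The description does not state how an ability value is computed. As one possible reading, the sketch below assumes the ability value of a category is the share of positively classified reviews in that category and renders the result as a simple text bar chart; both the scoring rule and the function names are assumptions.

```python
def ability_scores(labeled_reviews):
    """Compute the share of positive reviews per category.

    labeled_reviews is an iterable of (category, polarity) pairs,
    where polarity is +1 for positive and -1 for negative.
    """
    totals, positives = {}, {}
    for category, polarity in labeled_reviews:
        totals[category] = totals.get(category, 0) + 1
        positives[category] = positives.get(category, 0) + (1 if polarity > 0 else 0)
    return {category: positives[category] / totals[category] for category in totals}

def text_chart(scores, width=10):
    """Render the scores as one bar of '#' characters per category."""
    return "\n".join(
        f"{category:<12}{'#' * round(score * width)} {score:.2f}"
        for category, score in sorted(scores.items())
    )

designer_scores = ability_scores(
    [("layout", 1), ("layout", 1), ("color", 1), ("color", -1)]
)
chart = text_chart(designer_scores)
```

Here `designer_scores` holds one value per category, which a plotting library could equally draw as a bar or radar chart.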
FIG. 5 is a diagram illustrating a career work data analysis system according to another embodiment of the present invention.
Referring to FIG. 5, the career work data analysis system may include a matching management system and a career management system.
The matching management system may handle the overall process of an online work matching service. In particular, it may check the mutual needs of the requester and the performer in real time, perform matching based on satisfaction and feedback, and continuously acquire career data on micro tasks.
The matching management system may manage at least one of customer management, task management, information data management, competency management, administrator management, and purchase/cost management.
The career management system includes the career work data analysis system according to an embodiment of the present invention and may continuously acquire additional data related to career work by accessing various external sources, including external databases. This makes it possible to present standardized criteria for freelancers' experience, work history, and wage rates.
The career management system may manage at least one of task and career management for each worker and performer, information data management, career analysis management, document and certificate management, and micro task and freelancer career management.
In the matching management system, tasks are matched between customers and task performers, and feedback may arise from the tasks performed. The customer satisfaction and feedback generated here may be used as information data under the career management system to analyze the performer's career (competency), and the result may be provided to customers.
FIG. 6 is a flowchart showing the sequence of a career work data analysis method according to another embodiment of the present invention.
The career work data analysis method described below may be performed by the career work data analysis system and server described above with reference to FIGS. 1 to 5. Accordingly, the description of the embodiments of the present disclosure given above with reference to FIGS. 1 to 5 applies equally to the embodiment described below, and content that overlaps with the above description is omitted. The steps described below need not be performed in the stated order; the order of the steps may be set in various ways, and the steps may be performed substantially simultaneously.
Referring to FIG. 6, the career work data analysis method includes a career work data preprocessing step (S100), a career work data category classification step (S200), and a career work data sentiment analysis step (S300).
The career work data preprocessing step (S100) collects career work data and performs preprocessing, including tokenization, stopword removal, and noun extraction, on the collected career work data. For example, in the preprocessing step (S100), career work data that includes review data containing evaluations of experienced workers may be collected from at least one of a communicatively connected external server, an external database, and a user terminal.
The career work data category classification step (S200) classifies the category of the preprocessed career work data. For example, in the category classification step (S200), the career work data may be analyzed based on the LDA technique and classified into categories according to the analysis result. A category may consist of words from the career work data that share the same subject scope.
The career work data sentiment analysis step (S300) performs sentiment analysis on the career work data, classifying it as either positive or negative. For example, in the sentiment analysis step (S300), a subjectivity detection technique may be performed on the career work data to extract only the portions in which a person's subjective opinion appears, and a polarity detection technique may then be performed on those portions to classify each as either positive or negative.
Here, the subjectivity detection technique may be a detection technique that removes personal information and the portions of the career work data unrelated to sentiment, leaving only the elements to be used for sentiment analysis. The polarity detection technique may be a detection technique that detects and quantifies the positive and negative words in a text and applies statistical methods to assign each word a weight indicating whether it is positive or negative, thereby indicating the polarity of the sentence.
A person of ordinary skill in the art to which the present disclosure pertains will understand, based on the above description, that the disclosure can readily be modified into other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of the present disclosure is defined by the claims set out below rather than by the above detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present disclosure.
The mode for carrying out the invention is as described above in the best mode for carrying out the invention.
The present invention is a career work data analysis technology that can be used to automatically analyze submitted career work data and determine whether the evaluation of the experienced worker concerned is positive or negative, and it therefore has industrial applicability.

Claims (12)

  1. A career work data analysis method performed by a career work data analysis system, the method comprising:
    a) collecting career work data and performing preprocessing, including tokenization, stopword removal, and noun extraction, on the collected career work data;
    b) classifying a category of the preprocessed career work data; and
    c) performing sentiment analysis on the career work data to classify it as one of positive and negative.
  2. The method of claim 1,
    wherein, in step a), the career work data is collected from at least one of an external server, an external database, and a user terminal communicatively connected to the career work data analysis system, and
    wherein the career work data includes review data about a task performer.
  3. The method of claim 1,
    wherein, in step b),
    the career work data is analyzed based on an LDA (Latent Dirichlet Allocation) technique, and an ability value of each of one or more categories according to the analysis result is presented in graph form.
  4. The method of claim 1,
    wherein step b)
    further comprises performing a reclassification process on the career work data classified into the category, and
    wherein the reclassification process calculates a similarity between words or between a word and a category and classifies the career work data based on the similarity.
  5. The method of claim 1,
    wherein step c) comprises:
    c-1) performing a subjectivity detection technique on the career work data to extract only the portions of the career work data in which a person's subjective opinion appears; and
    c-2) performing a polarity detection technique on the portions of the career work data in which a person's subjective opinion appears to classify each such portion as one of positive and negative.
  6. The method of claim 5,
    wherein the subjectivity detection technique is a detection technique that removes personal information and portions of the career work data unrelated to sentiment, leaving only the elements to be used for sentiment analysis, and
    wherein the polarity detection technique is a detection technique that detects and quantifies positive and negative words in a text and applies weights representing the positive and negative to determine whether a sentence containing the text is positive or negative.
  7. A career work data analysis system comprising: a communication module;
    at least one processor; and
    a memory electrically connected to the processor and storing at least one code executed by the processor,
    wherein the memory stores code that, when executed by the processor, causes the processor to:
    collect career work data; perform preprocessing, including tokenization, stopword removal, and noun extraction, on the collected career work data; classify a category of the preprocessed career work data; and perform sentiment analysis on the career work data to classify it as one of positive and negative.
  8. The system of claim 7,
    wherein the memory stores code that causes the processor to
    collect the career work data from at least one of an external server, an external database, and a user terminal communicatively connected to the career work data analysis system, and
    wherein the career work data includes review data about a task performer.
  9. The system of claim 7,
    wherein the memory stores code that causes the processor to
    analyze the career work data based on an LDA (Latent Dirichlet Allocation) technique and present, in graph form, an ability value of each of one or more categories according to the analysis result.
  10. The system of claim 7,
    wherein the memory stores code that causes the processor to
    perform a reclassification process on the career work data classified into the category, and
    wherein the reclassification process calculates a similarity between words or between a word and a category and classifies the career work data based on the similarity.
  11. The system of claim 7,
    wherein the memory stores code that causes the processor to
    perform a subjectivity detection technique on the career work data to extract only the portions of the career work data in which a person's subjective opinion appears, and perform a polarity detection technique on those portions to classify each as one of positive and negative.
  12. The system of claim 11,
    wherein the subjectivity detection technique is a detection technique that removes personal information and portions of the career work data unrelated to sentiment, leaving only the elements to be used for sentiment analysis, and
    wherein the polarity detection technique is a detection technique that detects and quantifies positive and negative words in a text and applies weights representing the positive and negative to determine whether a sentence containing the text is positive or negative.
PCT/KR2023/008436 2022-06-20 2023-06-19 Method and system for analyzing work experience data WO2023249345A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
KR10-2022-0074871 2022-06-20
KR20220074871 2022-06-20
KR20220074907 2022-06-20
KR10-2022-0074907 2022-06-20
KR10-2023-0077837 2023-06-19
KR1020230077837A KR20230174179A (en) 2022-06-20 2023-06-19 Apparatus and method for generating an avatar face for virtual fashion style fitting

Publications (1)

Publication Number Publication Date
WO2023249345A1 (en) 2023-12-28

Family

ID=89377916

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2023/008436 WO2023249345A1 (en) 2022-06-20 2023-06-19 Method and system for analyzing work experience data

Country Status (2)

Country Link
KR (1) KR20230174179A (en)
WO (1) WO2023249345A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180127622A (en) * 2017-12-07 2018-11-29 최윤진 Systems for data collection and analysis
US10303771B1 (en) * 2018-02-14 2019-05-28 Capital One Services, Llc Utilizing machine learning models to identify insights in a document
KR20200007713A (en) * 2018-07-12 2020-01-22 삼성전자주식회사 Method and Apparatus for determining a topic based on sentiment analysis
KR20210029006A (en) * 2019-09-05 2021-03-15 군산대학교산학협력단 Product Evolution Mining Method And Apparatus Thereof
KR20210044017A (en) * 2019-10-14 2021-04-22 한양대학교 산학협력단 Product review multidimensional analysis method and apparatus


Also Published As

Publication number Publication date
KR20230174179A (en) 2023-12-27


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23827461

Country of ref document: EP

Kind code of ref document: A1