CN109299865A - Psychological assessment system and method, information data processing terminal based on semantic analysis - Google Patents
- Publication number: CN109299865A
- Application number: CN201811034545.9A
- Authority: CN (China)
- Prior art keywords: text, semantic, module, word, information
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Abstract
The invention belongs to the technical field of computer software and discloses a psychological assessment system and method based on semantic analysis, and an information data processing terminal. The system comprises a basic information and parameter setting module, an item bank and item presentation module, a data acquisition module, a data processing module, a corpus and norm module, a semantic network computing module, and a visual feedback module. The invention applies semantic network theory, text mining, and related methods to creativity assessment, and develops test and evaluation software suited to online administration and computer scoring, which in some respects is better than traditional paper-and-pencil tests. The approach is supported by experimental evidence, individual evaluation is based on large-sample norm data, and the assessment is concise and user-friendly. It can be widely used in scientific research on creativity, innovation education in schools, and the selection and assessment of corporate R&D personnel, in support of building an "innovation-oriented country".
Description
Technical field
The invention belongs to the technical field of computer software, and more particularly relates to a psychological assessment system and method based on semantic analysis, and an information data processing terminal.
Background technique
At present, the prior art commonly used in the industry is as follows. Innovation is the source of national scientific and technological progress and economic development, and the basis on which individuals adapt to life and solve problems. China is comprehensively advancing the construction of an innovation-oriented country, in which the education and cultivation of innovative talent is crucial. Accurately assessing and predicting an individual's innovation ability is therefore particularly important in education and cultivation.
Current methods for assessing an individual's creative potential fall roughly into three kinds: paper-and-pencil tests, product evaluation, and subjective reports. Paper-and-pencil testing is the most popular form; its main instruments include divergent thinking tests, the Torrance Tests of Creative Thinking, and convergent thinking tests. However, these tests are mostly based on the three-dimensional (cube) theory of intelligence, and their external validity is disputed. Traditional creativity assessment also has the following shortcomings. First, such tests emphasize an individual's processing fluency and flexibility, and thereby mask the core trait of individual creativity: the ability to form remote semantic associations. Second, traditional paper-and-pencil tests give poor control over the participant's timing, involve complicated procedures, and are time-consuming and laborious to score, so feedback is slow and the tests are difficult to apply in practice. Third, both divergent thinking tests and product evaluation are affected by the experience of the raters or experts and by their own level of creativity, so scoring is subjective. Fourth, subjective-report methods such as traditional questionnaires cannot avoid the social desirability effect or the subjectivity of the individual. In addition, the core of creativity is the cognitive ability of creative thinking, and questionnaire measures cannot capture the cognitive process.
In conclusion the main problem of traditional creativity assessment technology includesAppraisal procedure is complicated, assessment subjectivity is strong, applies
It surveys at high cost, causes to be difficult in applications such as school eduaction, enterprise selection of talented people.
On this basis, the present invention combines semantic similarity, as used in semantic hierarchy theory and prototype heuristic theory, with methods that assess creativity from semantic distance. It proposes computing semantic distance and semantic similarity and constructing a semantic network by means of natural language processing, so as to quantify an individual's creative potential objectively. Traditional semantic distance computation depends on a text library, and whether an existing text library can serve as the basis for individual creativity assessment needs to be considered. The invention is therefore based on a previously established text library of divergent thinking, convergent thinking, scientific invention, and story creation tests, which ensures the validity of the evaluation indices. In addition, computer-aided technology is combined with text mining and computer interaction technology to develop an online creativity scoring and feedback system.
The system overcomes the cumbersome procedures of traditional assessment methods. Multiple users can assess individual innovation ability online, anytime and anywhere, through media such as computers and mobile phones, so the system is suitable for testing in large-scale innovation education settings and can also be used for human resources assessment in companies. In addition, the algorithms built into the system provide user feedback quickly, overcoming the time-consuming and laborious evaluation step of traditional assessment. Most importantly, the evaluation scheme based on semantic distance and semantic networks reflects an individual's innovation potential more objectively.
Summary of the invention
In view of the problems in the prior art, the present invention provides a psychological assessment system and method based on semantic analysis, and an information data processing terminal.
The invention is realized as follows. A psychological assessment system based on semantic analysis comprises:
A basic information and parameter setting module, for collecting the user's basic information, generating a matching item bank according to the entered age and educational background information, and generating corresponding items according to the assessment purpose;
An item bank and item presentation module, for storing items of all types and their corresponding instructions, under the control of the parameter setting module;
A data acquisition module, for collecting the user's response information;
A data preprocessing module, for receiving the input text from the data acquisition and text transcription module, segmenting the text into words, and analyzing the nouns, adjectives, and verbs of the text;
A corpus and norm module, providing the creativity corpus that is compiled from divergent thinking and scientific invention material libraries and established by collecting and summarizing the answers of 500 participants and performing text analysis on them;
A semantic network computing module, for performing semantic distance computation and semantic network measurement on new input text;
A visual feedback module, for providing result feedback and suggestions to the user.
Further, the data acquisition module comprises:
A text answer unit, for entering answers via a computer keyboard or a mobile phone; a user who cannot key in text may have another person enter the text on their behalf;
A voice answer unit, for capturing the user's spoken answers as voice information, transcribing the voice information into text information using open-source speech recognition software, and storing it.
Further, the data preprocessing module comprises:
A word segmentation module, which uses Python and the Jieba ("stammerer") Chinese word segmentation library in accordance with natural language processing methods, and applies bidirectional maximum matching together with feature scanning to identify and cut out words with obvious features in the character string to be analyzed; with these words as breakpoints, the original string is divided into smaller strings, which are then segmented mechanically.
The corpus and norm module further comprises:
A relative answer assignment unit, for assigning values to the answers in the corpus;
A semantic assignment unit, for computing the semantic distance between any two words according to the established text corpus.
The semantic network computing module further comprises:
A semantic network unit, in which words are nodes and the semantic distances between words are edges, forming a matrix from which the degree of a word, i.e. the importance of that word in the semantic network, is computed;
A network clustering coefficient unit, for describing the degree of clustering of the semantic network;
A semantic distance algorithm unit, for constructing word vectors according to the LSA, Word2vec, and TF-IDF algorithms and, in combination with cosine similarity, constructing a semantic vector space and obtaining the similarity between the words of the text.
Another object of the present invention is to provide a psychological assessment method based on semantic analysis that uses the above psychological assessment system based on semantic analysis, the method comprising:
Setting basic parameters: basic parameters must be set when the system is initialized. After logging in to the back-end management interface, the administrator follows the instructions to complete settings such as the segmentation dictionary, item labels, item bank matching rules, and algorithm-related threshold parameters. The parameters can be modified later as circumstances require.
Collecting the user's basic information: information is entered in one of two ways, batch import by the administrator in the back end or self-registration by the user in the front end. Batch import is used to enter basic information that has already been obtained; after import, user account information is generated automatically and distributed to users for logging in to the system. With self-registration, users fill in their own basic personal information after registering in the system. The entered information is stored in the back-end database for persistence.
Generating the item bank comprises: (1) Item entry. The items for creativity assessment are compiled on the basis of creativity theories. Item information is entered by the back-end administrator in one of two ways, batch import or page entry. Batch import is used when the data volume is large: documents in formats such as Excel, Word, or TXT are uploaded into the system database. Page entry is used when the data volume is small: the administrator fills in the item content directly on the page. During entry, items are labeled and classified according to factors such as test purpose and item type, and saved in the database. The back-end system provides functions for adding, deleting, modifying, and querying existing items, and for uploading attachments. (2) Item bank generation. According to the entered age and educational background information, matching item labels are found according to the matching rules, and the basic information of the item bank (item bank label, template, etc.) is generated and stored in the database. According to the assessment purpose, items with labels of the corresponding type are retrieved from the item data table and associated with the item bank label according to certain rules, generating the specific content of the item bank;
Collecting assessment data: after the user enters the system, the corresponding item bank is automatically selected for the user from the item bank table according to the user's basic information and test purpose, and presented on the page. The user enters test content (in forms such as text or voice) according to the instructions; after completion, the information is submitted and saved to the database by the back-end program;
Data preprocessing: 1. Text transcription. Non-text data is converted into text data using a text transcription tool and saved in the original text data table. 2. Text segmentation. The Jieba ("stammerer") segmentation tool performs coarse segmentation of the original text data; punctuation marks and other non-Chinese symbols are filtered out, and the result is saved as a txt file. The file directory information is saved in the database. 3. Part-of-speech tagging. The data from the first segmentation pass is cleaned again, stop words are filtered out, and the remaining words are tagged with parts of speech. This information is stored in a txt file in a defined format. The file directory information is saved in the database. 4. Keyword extraction. Weights are computed using the TF-IDF (term frequency-inverse document frequency) method. All keywords are extracted according to the configured keyword threshold, and the keyword information (weight, part of speech, etc.) is saved in the database.
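The keyword extraction step can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `tfidf_keywords`, the sample documents, and the threshold value are hypothetical, and the plain `log(N / df)` form of IDF is an assumption (smoothed variants are also common). The input is assumed to be text that has already been segmented and stripped of stop words.

```python
import math
from collections import Counter


def tfidf_keywords(docs, threshold):
    """Return, for each document, the words whose TF-IDF weight reaches `threshold`.

    `docs` is a list of already segmented documents (each a list of words).
    """
    n_docs = len(docs)
    # Document frequency: the number of documents in which each word occurs.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    results = []
    for doc in docs:
        weights = {}
        for word, count in Counter(doc).items():
            tf = count / len(doc)                      # term frequency in this document
            idf = math.log(n_docs / doc_freq[word])    # assumed unsmoothed IDF
            weights[word] = tf * idf
        # Keep only the words at or above the configured keyword threshold.
        results.append({w: s for w, s in weights.items() if s >= threshold})
    return results


# Hypothetical segmented answers to an "uses of a brick" item.
SAMPLE_DOCS = [
    ["brick", "build", "wall", "brick"],
    ["brick", "paperweight"],
    ["brick", "draw", "chalk", "draw"],
]
SAMPLE_KEYWORDS = tfidf_keywords(SAMPLE_DOCS, threshold=0.2)
```

In this toy data the word "brick" occurs in every document, so its IDF is zero and it is never selected as a keyword, which matches the way TF-IDF filters out common words.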
Corpus construction: the corpus has two parts. The first is the creativity corpus, compiled from divergent thinking and scientific invention material libraries and established by summarizing and analyzing the answers of 500 participants collected earlier; it is the corpus on which the initial system bases its creativity assessment. It is imported into the database at system initialization, and is later updated and expanded as the test sample grows. The second is large-scale text data obtained for this research by crawling websites such as Baidu Baike, Chinese Wikipedia, and Weibo with a Python crawler. The two parts complement each other, so that the corpus satisfies both capacity and applicability requirements.
Semantic network construction: word vectors are built with a neural network and Skip-Gram under the deep learning framework TensorFlow. With the LSA (latent semantic analysis) algorithm, documents represented in the high-dimensional vector space model (VSM) are mapped into a low-dimensional latent semantic space by singular value decomposition (SVD) of the term/document matrix to build text vectors; the semantic similarity between words is obtained at the same time, and the word semantic network matrix is constructed from it. The computed semantic network is saved as an XML document and indexed in the database.
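Once word vectors are available, turning them into a word-by-word semantic matrix might look like the sketch below. The word vectors here are made-up two-dimensional values rather than the output of Skip-Gram or LSA, the function names are hypothetical, and defining semantic distance as one minus cosine similarity is an assumption, since the text does not give an explicit formula.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_distance_matrix(vectors):
    """Return (words, matrix) where matrix[i][j] is the assumed distance 1 - cosine."""
    words = sorted(vectors)
    matrix = [
        [1.0 - cosine_similarity(vectors[a], vectors[b]) for b in words]
        for a in words
    ]
    return words, matrix


# Hypothetical word vectors, for illustration only.
TOY_VECTORS = {
    "brick": [1.0, 0.0],
    "wall": [0.8, 0.6],
    "chalk": [0.0, 1.0],
}
WORDS, DISTANCES = semantic_distance_matrix(TOY_VECTORS)
```

With these toy vectors, "brick" is closer to "wall" than to "chalk", so a response of "chalk" to a brick item would count as the more remote association.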
Score calculation: using Python, the extracted keywords are matched against the original corpus established in advance for this research to obtain a preliminary score. If no match is found in the corpus, matrix analysis libraries such as NumPy are used, on the basis of the constructed semantic network, to perform latent semantic analysis and compute the similarity between the answer and the corpus; the answer is then given a weight according to that similarity and scored.
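The two-stage scoring logic (exact corpus match first, similarity-weighted fallback otherwise) can be illustrated as below. The text does not specify the weighting rule, so this sketch assumes one possible rule: an unmatched keyword borrows the points of its most similar corpus answer, scaled by the similarity. The names `score_keyword`, `score_answer`, and the toy similarity table are all hypothetical.

```python
def score_keyword(word, corpus_points, similarity):
    """Score one keyword: exact corpus match first, similarity-weighted fallback otherwise."""
    if word in corpus_points:
        return float(corpus_points[word])
    # Fallback (assumed rule): find the most similar corpus answer ...
    best = max(corpus_points, key=lambda known: similarity(word, known), default=None)
    if best is None:
        return 0.0
    # ... and scale its points by the similarity, never going below zero.
    weight = max(0.0, similarity(word, best))
    return weight * corpus_points[best]


def score_answer(keywords, corpus_points, similarity):
    """Total score of an answer, given its extracted keywords."""
    return sum(score_keyword(word, corpus_points, similarity) for word in keywords)


# Hypothetical corpus values and similarities, for illustration only.
TOY_POINTS = {"chalk": 3, "wall": 1}
TOY_SIMILARITY = {("crayon", "chalk"): 0.5, ("crayon", "wall"): 0.1}


def toy_similarity(a, b):
    return TOY_SIMILARITY.get((a, b), 0.0)


TOTAL = score_answer(["chalk", "crayon"], TOY_POINTS, toy_similarity)
```

Passing the similarity function in as an argument keeps the scoring rule independent of how similarity is obtained, whether from LSA, Word2vec, or another model.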
Another object of the present invention is to provide a computer program implementing the psychological assessment method based on semantic analysis.
Another object of the present invention is to provide an information data processing terminal implementing the psychological assessment method based on semantic analysis.
Another object of the present invention is to provide a computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to execute the psychological assessment method based on semantic analysis.
In summary, the advantages and positive effects of the present invention are as follows.
Cutting-edge methods from natural language processing are applied to the traditional semantics-based assessment of creative thinking. Semantic text is analyzed and processed by means of semantic networks and latent semantic analysis, a creativity corpus is established, and a spatial semantic network model is constructed, so as to assess creative thinking. The natural language processing techniques used are shown in Table 1:
Table 1: Techniques used and their effects
Semantic distance and semantic similarity are computed and a semantic network is constructed by natural language processing, giving a method for scientifically and quantitatively evaluating an individual's creative potential (see Fig. 1). Text mining is combined with computer interaction methods to develop an online creativity scoring and feedback system. This improves the operability and objectivity of creative thinking assessment, and overcomes the cumbersome steps and excessive subjectivity of traditional scoring. It improves the accuracy of assessment while saving labor and material resources, and provides users with convenient, accurate, objective, and timely indices of creative ability.
Experiments verify that, compared with traditional creative thinking assessment, the creative thinking assessment model based on natural language processing has the following advantages (Table 2):
Table 2: Advantages of assessing creative thinking with natural language processing
To verify these results, 400 university students were randomly selected to take part in an experiment. The scores from the traditional creative thinking assessment and from the creative thinking assessment model based on natural language processing were each analyzed against two indices of the participants, creative behavior and creative achievement, using Pearson correlation analysis and multiple linear regression analysis. The results show that the correlation coefficients and predictive validity of the model based on natural language processing with respect to the two indices are both higher than those of traditional manual assessment.
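For readers unfamiliar with the statistic, the Pearson correlation used in this kind of validation can be computed as in the following sketch. The function name `pearson_r` is hypothetical and the numbers are invented purely to exercise the function; they are not the experimental data, which are not reproduced in the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented model scores and criterion scores (not data from the experiment).
MODEL_SCORES = [1.0, 2.0, 3.0, 4.0]
CRITERION_SCORES = [2.0, 4.0, 6.0, 8.0]
R_POSITIVE = pearson_r(MODEL_SCORES, CRITERION_SCORES)
R_NEGATIVE = pearson_r(MODEL_SCORES, CRITERION_SCORES[::-1])
```

A coefficient near 1 means the assessment scores rise together with the criterion index; comparing this coefficient between two scoring methods is one way to compare their criterion-related validity.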
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the psychological assessment system based on semantic analysis provided by an embodiment of the present invention;
In the figure: 1, basic information and parameter setting module; 2, item bank and item presentation module; 3, data acquisition module; 4, data preprocessing module; 5, corpus and norm module; 6, semantic network computing module; 7, visual feedback module.
Fig. 2 is a flow chart of the psychological assessment method based on semantic analysis provided by an embodiment of the present invention.
Fig. 3 is a schematic diagram of the implementation structure of the psychological assessment system based on semantic analysis provided by an embodiment of the present invention.
Fig. 4 is a schematic diagram of the creativity index system provided by an embodiment of the present invention.
Fig. 5 is a schematic diagram of test result feedback provided by an embodiment of the present invention.
Fig. 6 is a rendering of the semantic network provided by an embodiment of the present invention.
Specific embodiment
In order to make the objects, technical solutions, and advantages of the present invention clearer, the invention is further described below with reference to embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
The invention aims to overcome the shortcomings of traditional creativity assessment by designing a concise and convenient online creativity testing system with good reliability and validity, for talent assessment and selection, assessment of adolescents' innovation potential, and innovation education.
The application principle of the invention is explained in detail below with reference to the drawings.
As shown in Fig. 1, the psychological assessment system based on semantic analysis provided by an embodiment of the present invention comprises: a basic information and parameter setting module 1, an item bank and item presentation module 2, a data acquisition module 3, a data preprocessing module 4, a corpus and norm module 5, a semantic network computing module 6, and a visual feedback module 7.
The basic information and parameter setting module 1 collects the user's basic information, generates a matching item bank according to the entered age and educational background information, and generates corresponding items according to the assessment purpose.
The item bank and item presentation module 2 stores items of all types and their corresponding instructions, under the control of the parameter setting module.
The data acquisition module 3 collects the user's response information.
The data preprocessing module 4 receives the input text from the data acquisition and text transcription module, segments the text into words, and analyzes the nouns, adjectives, and verbs of the text.
The corpus and norm module 5 provides the creativity corpus that is compiled from divergent thinking and scientific invention material libraries and established by collecting and summarizing the answers of 500 participants and performing text analysis on them.
The semantic network computing module 6 performs semantic distance computation and semantic network measurement on new input text.
The visual feedback module 7 provides result feedback and suggestions to the user.
As shown in Fig. 2, the psychological assessment method based on semantic analysis provided by an embodiment of the present invention comprises the following steps:
S101: collecting the user's basic information, generating a matching item bank according to the entered age and educational background information, and generating corresponding items according to the assessment purpose;
S102: storing items of all types and their corresponding instructions, under the control of the parameter setting module;
S103: collecting the user's response information;
S104: receiving the input text from the data acquisition and text transcription module, segmenting the text into words, and analyzing the nouns, adjectives, and verbs of the text;
S105: establishing the creativity corpus, compiled from divergent thinking and scientific invention material libraries, by collecting and summarizing the answers of 500 participants and performing text analysis on them;
S106: performing semantic distance computation and semantic network measurement on new input text;
S107: providing result feedback and suggestions to the user.
In a preferred embodiment of the invention: the basic information and parameter setting module 1 mainly collects the user's basic information, such as age, gender, educational background, assessment purpose, and contact details. A matching item bank is generated according to the entered age and educational background information, and corresponding items are then generated according to the assessment purpose.
In a preferred embodiment of the invention: the data acquisition module 3 collects the user's response information, and the user may answer by text or by voice. Text answers are entered via a computer keyboard or a mobile phone; a user who cannot key in text may have another person enter the text on their behalf. Voice answers capture the user's spoken answers as voice information, which is transcribed into text information using open-source speech recognition software and stored.
In a preferred embodiment of the invention: the data preprocessing module 4 receives the input text from the data acquisition and text transcription module and segments the text into words, mainly analyzing the nouns, adjectives, and verbs of the text. The word segmentation module uses Python and, in accordance with natural language processing (NLP) theory, the Jieba ("stammerer") Chinese word segmentation library. It applies bidirectional maximum matching combined with feature scanning to identify and cut out words with obvious features in the character string to be analyzed; with these words as breakpoints, the original string is divided into smaller strings, which are then segmented mechanically, reducing the matching error rate. For example, the phrase rendered as "makeup and clothes" is segmented as "makeup / and / clothes" rather than being split across a word boundary.
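Bidirectional maximum matching can be sketched in a few lines of pure Python. This is an illustration of the general technique under stated assumptions, not the Jieba library's internal algorithm: the toy dictionary, the sample phrase 化妆和服装 ("makeup and clothing"), and the tie-breaking rule (fewer words, then fewer out-of-dictionary words, then the backward result) are all assumptions chosen for the example.

```python
def forward_max_match(text, vocab, max_len):
    """Greedy left-to-right pass: take the longest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:   # single characters are always accepted
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, vocab, max_len):
    """Greedy right-to-left pass, the mirror image of forward_max_match."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    return words[::-1]


def bidirectional_max_match(text, vocab):
    """Run both passes; prefer fewer words, then fewer unknown words, then backward."""
    max_len = max(len(word) for word in vocab)   # assumes a non-empty dictionary
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)

    def cost(words):
        unknown = sum(word not in vocab for word in words)
        return (len(words), unknown)

    return fwd if cost(fwd) < cost(bwd) else bwd


# Hypothetical toy dictionary: makeup, kimono, clothing, and.
TOY_VOCAB = {"化妆", "和服", "服装", "和"}
TOY_TEXT = "化妆和服装"
```

On this phrase the forward pass alone greedily takes 和服 ("kimono") and leaves a stray character, while the backward pass finds 服装 ("clothing") first; comparing the two passes is what lets the method recover the intended split.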
In a preferred embodiment of the invention: one part of the corpus and norm module 5 is built on a 500-person norm database, namely the creativity corpus compiled from divergent thinking and scientific invention material libraries and established by collecting and summarizing the answers of 500 participants and performing text analysis on them. The other part is large-scale text data obtained by crawling Baidu Baike with Python. The two parts complement each other, so that the corpus satisfies both capacity and applicability requirements. The corpus and norm module 5 has two functions. The first is relative answer assignment: for example, the item word "brick" has a dictionary of 100 answers, in which the answer "chalk" is assigned 3 points in the 500-person corpus. The second is semantic assignment: the semantic distance between any two words is computed from the established text corpus.
In a preferred embodiment of the invention: the semantic network computing module 6 performs semantic distance computation and semantic network measurement on new input text. A semantic network is a matrix in which words are nodes and the semantic distances between words are edges. From the matrix one can compute the degree of a word, i.e. the importance of that word in the semantic network, and the clustering coefficients of the network, which describe how strongly the semantic network is clustered. The clustering coefficient of a given node is the likelihood that its neighbor nodes are connected to one another. The larger the clustering coefficient of a node, the less important that node is within the network formed by it and its neighbors: because the neighbors of the node are closely connected to each other, removing the node has little effect on the connectivity of its neighbors. The TF-IDF algorithm computes the term frequency (TF) and the inverse document frequency (IDF) separately; a high term frequency within a particular document, together with a low document frequency of the word across the whole document set, yields a high TF-IDF weight. TF-IDF therefore tends to filter out common words and retain important ones, which allows keywords to be extracted. The semantic distance algorithm constructs word vectors according to the LSA, Word2vec, and TF-IDF algorithms and, in combination with methods such as cosine similarity and skip-gram, constructs a semantic vector space and obtains the similarity between the words of the text.
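The two network measures named above, degree and clustering coefficient, can be computed as in the following sketch. It assumes the semantic network has already been reduced to an undirected graph stored as adjacency sets (for instance by linking word pairs whose semantic distance falls below a threshold, which is an assumption rather than something the text specifies); the function names and the toy graph are hypothetical.

```python
from itertools import combinations


def degree(graph, node):
    """Number of words directly linked to `node`."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0   # fewer than two neighbours: no pairs to compare
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


# Hypothetical undirected semantic network, stored as adjacency sets.
TOY_GRAPH = {
    "brick": {"wall", "house", "chalk"},
    "wall": {"brick", "house"},
    "house": {"brick", "wall"},
    "chalk": {"brick"},
}
```

In this toy graph "brick" has the highest degree but a low clustering coefficient, because it is the only link to "chalk"; "wall" has a clustering coefficient of 1, so removing it would not disconnect its neighbours.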
The visual feedback module 7 provides result feedback and suggestions to the user. The result feedback includes the user's score on each dimension of creativity, the total score, and the user's ranking within the group. Based on the scores, the system can offer suggestions on the choice of study direction, on career choice, and on improving creativity.
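One simple way to express a user's standing within the group is a percentile rank against the norm sample, as sketched below. The text does not say how the ranking is computed, so the mid-rank percentile formula, the function name `percentile_rank`, and the sample norm scores are all assumptions for illustration.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below `score`, counting ties as half."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)            # scores strictly below
    equal = bisect_right(ordered, score) - below   # scores tied with `score`
    return 100.0 * (below + 0.5 * equal) / len(ordered)


# Hypothetical total scores of a small norm group.
TOY_NORM = [10, 20, 20, 30, 40]
```

A feedback page could then report, for example, that a total score of 30 is higher than most of the norm group, alongside the per-dimension scores.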
The psychological assessment system based on semantic analysis provided by the embodiment of the present invention is installed on a user terminal (a terminal device such as a computer, mobile phone, or tablet) and provides the user with an access interface through a visual interface. Through the visual interface of the user terminal, the user can log in to the system platform, take the test online, and view the report and feedback results after the test.
The above embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented wholly or partly as a computer program product, the computer program product comprises one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A psychological assessment system based on semantic analysis, characterized in that the psychological assessment system based on semantic analysis comprises:
a basic information and parameter setting module, for acquiring the user's basic information, generating a matching question bank according to the entered age, education, and occupation information, and generating corresponding questions according to the purpose of the assessment;
a question bank and question presentation module, for storing questions of all types together with their corresponding instructions, and regulated by the parameter setting module;
a data acquisition module, for collecting the user's response information;
a data preprocessing module, for receiving the input text from the data acquisition and text transcription modules, segmenting the text into words, and analyzing the nouns, adjectives, and verbs of the text;
a corpus and norm module, holding the creativity corpus that is compiled from the divergent thinking and scientific invention material banks and established by collecting, summarizing, and text-analyzing the answers of 500 subjects;
a semantic network computing module, for performing semantic distance calculation and semantic network measurement on newly input text;
a visual feedback module, for providing the user with result feedback and suggestions.
2. The psychological assessment system based on semantic analysis according to claim 1, characterized in that the data acquisition module comprises:
a text answer unit, for entering answers with a computer keyboard or a mobile phone; a user who cannot type may have another person enter the text on his or her behalf;
a voice answer unit, for collecting the user's spoken answers as voice information, and for transcribing the voice information into text information with open-source speech recognition software and storing it.
3. The psychological assessment system based on semantic analysis according to claim 1, characterized in that the data preprocessing module comprises:
a word segmentation module which, following natural language processing methods, uses Python and the Jieba ("stutter") Chinese word segmentation library together with bidirectional maximum matching and feature scanning: words with obvious features are first identified in the character string to be analyzed and cut out; using these words as breakpoints, the original string is divided into shorter strings, on which mechanical word segmentation is then performed.
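The segmentation strategy named in claim 3 can be illustrated with a minimal sketch. It is not the Jieba implementation and not the patentee's code: the toy dictionary, the feature-word set, and the tie-breaking rule (fewer words, then fewer single-character words) are assumptions chosen for illustration, and all function names are hypothetical.

```python
import re


def forward_max_match(text, dictionary, max_len):
    """Greedy left-to-right matching: take the longest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan never stalls.
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, dictionary, max_len):
    """Greedy right-to-left matching: take the longest dictionary word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


def bidirectional_max_match(text, dictionary):
    """Run both scans and keep the result with fewer words, then fewer single characters."""
    max_len = max((len(w) for w in dictionary), default=1)
    fwd = forward_max_match(text, dictionary, max_len)
    bwd = backward_max_match(text, dictionary, max_len)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd
    singles = lambda ws: sum(1 for w in ws if len(w) == 1)
    return fwd if singles(fwd) < singles(bwd) else bwd


def segment(text, dictionary, feature_words):
    """Feature scan: cut at feature words first, then segment each shorter string."""
    if not feature_words:
        return bidirectional_max_match(text, dictionary)
    ordered = sorted(feature_words, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(w) for w in ordered) + ")"
    words = []
    for piece in re.split(pattern, text):
        if not piece:
            continue
        if piece in feature_words:
            words.append(piece)  # breakpoint word, kept as-is
        else:
            words.extend(bidirectional_max_match(piece, dictionary))
    return words


# Toy dictionary and feature word (assumptions for this example only).
toy_dictionary = {"研究", "研究生", "生命", "起源"}
print(segment("研究生命的起源", toy_dictionary, {"的"}))
# ['研究', '生命', '的', '起源']
```

In this example the forward scan alone would produce 研究生 / 命, while the backward scan gives 研究 / 生命; comparing both is what lets the bidirectional method recover the intended reading.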
4. The psychological assessment system based on semantic analysis according to claim 1, characterized in that the corpus and norm module further comprises:
an answer relative-assignment unit, for assigning values to the answers in the corpus;
a semantic assignment unit, for calculating the semantic distance between any two words on the basis of the established text corpus.
5. The psychological assessment system based on semantic analysis according to claim 1, characterized in that the semantic network computing module further comprises:
a semantic network unit, in which words are the nodes and the semantic distances between words are the edges, together forming a matrix, and which calculates the degree of a word, that is, the importance of that word in the semantic network;
a network clustering coefficient unit, for describing the degree to which the semantic network forms clusters;
a semantic distance algorithm unit, for constructing word vectors according to the LSA, Word2vec, and TF-IDF algorithms and, in combination with cosine similarity, constructing a semantic vector space to obtain the similarity between the words of the text.
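The three quantities in claim 5 (cosine similarity, node degree, and clustering coefficient) can be sketched as follows. This is an illustrative reading, not the patentee's code: the claim does not say how edges are derived from distances, so linking two words when their cosine similarity reaches a threshold is an assumption, as are the two-dimensional toy vectors and all function names.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_network(vectors, threshold):
    """Assumed edge rule: connect two words when their similarity is >= threshold."""
    graph = {word: set() for word in vectors}
    for a, b in combinations(vectors, 2):
        if cosine_similarity(vectors[a], vectors[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree(graph, word):
    """Number of neighbours; used here as the word's importance in the network."""
    return len(graph[word])


def clustering_coefficient(graph, word):
    """Fraction of the word's neighbour pairs that are themselves connected."""
    neighbours = graph[word]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


# Toy word vectors (hypothetical values, not taken from any trained model).
toy_vectors = {
    "a": (1.0, 0.0),
    "b": (1.0, 1.0),
    "c": (0.0, 1.0),
    "d": (1.0, 0.2),
}
network = build_network(toy_vectors, threshold=0.7)
for word in sorted(network):
    print(word, degree(network, word), round(clustering_coefficient(network, word), 3))
```

With these vectors, word "b" is linked to all three other words, so it has the highest degree, while only one of its three neighbour pairs is linked, which gives it a lower clustering coefficient than "a" or "d".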
6. A psychological assessment method based on semantic analysis using the psychological assessment system based on semantic analysis according to claim 1, characterized in that the psychological assessment method based on semantic analysis comprises:
acquiring the user's basic information, generating a matching question bank according to the entered age and education information, and generating corresponding questions according to the purpose of the assessment;
storing questions of all types together with their corresponding instructions, under the regulation of the parameter setting module;
collecting the user's response information;
receiving the input text from the data acquisition and text transcription modules, segmenting the text into words, and analyzing the nouns, adjectives, and verbs of the text;
establishing a creativity corpus, compiled from the divergent thinking and scientific invention material banks, by collecting, summarizing, and text-analyzing the responses of 500 subjects;
performing semantic distance calculation and semantic network measurement on newly input text;
providing the user with result feedback and suggestions.
7. The psychological assessment method based on semantic analysis according to claim 6, characterized in that the psychological assessment method based on semantic analysis comprises:
setting basic parameters: according to the instructions, completing the setting of the segmentation dictionary, the question labels, the question bank matching rules, and the threshold parameters of the algorithms;
acquiring the user's basic information: by batch import, used to enter basic information that has already been obtained, after which user account information is generated automatically and distributed to the users for system login; or by self-registration, in which users register in the system and then fill in their personal basic information themselves;
generating the question bank, including question entry, where the creativity assessment questions are compiled on the basis of theories of creativity; and question bank generation, where, according to the entered age and education information, the matching question labels are found by the matching rules, and the basic information of the generated question bank is stored in the database;
acquiring assessment data: according to the user's basic information and the purpose of the test, the corresponding question bank is selected automatically for the user directly from the question bank table and presented on the page;
preprocessing the data: text transcription, in which non-text data is converted to text data with a text transcription tool and saved in the original text data table; text segmentation, in which the Jieba segmentation tool performs a coarse segmentation of the original text data, non-Chinese symbols such as punctuation marks are filtered out, and the result is saved to a txt file; part-of-speech tagging, in which the data from the first segmentation is cleaned again, stop words are filtered out, the remaining words are tagged with their parts of speech, the information is stored in a txt file in a fixed format, and the file directory information is saved in the database; and keyword extraction, in which weights are calculated by the TF-IDF (term frequency-inverse document frequency) method, all keywords are extracted according to the configured keyword threshold, and the keyword information is saved in the database;
constructing the corpus: the creativity corpus is compiled from the divergent thinking and scientific invention material banks and established by summarizing and text-analyzing the answers of 500 subjects collected at an earlier stage, together with large-scale text data crawled with a Python crawler from Baidu Baike, Chinese Wikipedia, and Weibo;
constructing the semantic network: word vectors are constructed with a neural network and Skip-Gram under the TensorFlow deep learning framework; with the LSA (latent semantic analysis) algorithm, documents represented in a high-dimensional vector space model are mapped into a low-dimensional latent semantic space by singular value decomposition of the term-document matrix, which constructs the text vectors and at the same time yields the semantic similarity between words; the computed semantic network is saved as an xml file and indexed in the database;
calculating the score: using Python, the extracted keywords are matched against the original corpus established in advance to obtain a preliminary score; if a keyword has no match in the corpus, latent semantic analysis is performed on the constructed semantic network with a matrix analysis library such as NumPy, the similarity between the answer and the corpus is calculated, and the similarity is given a certain weight to score the answer accordingly.
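The keyword extraction and scoring steps of claim 7 can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the claimed implementation: the smoothed IDF formula, the fallback weight of 0.5, the use of the single most similar corpus entry, and all function names are hypothetical, and the `similarity` argument is a stand-in for whatever LSA or word-vector similarity the semantic network supplies.

```python
import math
from collections import Counter


def tfidf_weights(doc_tokens, corpus_docs):
    """TF-IDF weight of each term in one tokenized document, with a smoothed IDF."""
    counts = Counter(doc_tokens)
    n_docs = len(corpus_docs)
    weights = {}
    for term, count in counts.items():
        tf = count / len(doc_tokens)
        df = sum(1 for doc in corpus_docs if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1  # smoothing choice is an assumption
        weights[term] = tf * idf
    return weights


def extract_keywords(doc_tokens, corpus_docs, threshold):
    """Keep the terms whose TF-IDF weight reaches the configured keyword threshold."""
    weights = tfidf_weights(doc_tokens, corpus_docs)
    return [term for term, w in weights.items() if w >= threshold]


def score_answer(keywords, corpus_scores, similarity, weight=0.5):
    """Direct corpus match gives the stored score; otherwise fall back to similarity.

    corpus_scores maps corpus entries to their assigned values.
    similarity(a, b) is a caller-supplied function returning a value in [0, 1].
    """
    total = 0.0
    for kw in keywords:
        if kw in corpus_scores:
            total += corpus_scores[kw]  # preliminary score from an exact match
            continue
        # No match: borrow the score of the most similar corpus entry, down-weighted.
        best = max(corpus_scores, key=lambda entry: similarity(kw, entry), default=None)
        if best is not None:
            total += weight * similarity(kw, best) * corpus_scores[best]
    return total


# Toy data (hypothetical): three tokenized answers and a hand-made similarity table.
docs = [["the", "wall"], ["the", "door"], ["brick", "brick", "the"]]
print(extract_keywords(docs[2], docs, threshold=0.5))  # ['brick']

toy_similarity = {("tower", "wall"): 0.8, ("tower", "bridge"): 0.5}
lookup = lambda a, b: toy_similarity.get((a, b), 0.0)
total = score_answer(["bridge", "tower"], {"bridge": 2.0, "wall": 1.0}, lookup)
print(round(total, 2))  # 2.4
```

Passing the similarity function in as an argument keeps the scoring logic independent of how the semantic space is built, so an LSA-based or Skip-Gram-based similarity could be substituted without changing `score_answer`.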
8. A computer program implementing the psychological assessment method based on semantic analysis according to any one of claims 6 to 7.
9. An information data processing terminal implementing the psychological assessment method based on semantic analysis according to any one of claims 6 to 7.
10. A computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to execute the psychological assessment method based on semantic analysis according to any one of claims 6 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811034545.9A CN109299865B (en) | 2018-09-06 | 2018-09-06 | Psychological evaluation system and method based on semantic analysis and information data processing terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811034545.9A CN109299865B (en) | 2018-09-06 | 2018-09-06 | Psychological evaluation system and method based on semantic analysis and information data processing terminal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109299865A true CN109299865A (en) | 2019-02-01 |
CN109299865B CN109299865B (en) | 2021-12-17 |
Family
ID=65166131
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811034545.9A Active CN109299865B (en) | 2018-09-06 | 2018-09-06 | Psychological evaluation system and method based on semantic analysis and information data processing terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109299865B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109903096A (en) * | 2019-03-05 | 2019-06-18 | 上海硕恩网络科技股份有限公司 | A kind of financial consumption psychological test method of calibration based on social platform |
CN110264038A (en) * | 2019-05-22 | 2019-09-20 | 深圳壹账通智能科技有限公司 | A kind of generation method and equipment of product appraisal model |
CN110570941A (en) * | 2019-07-17 | 2019-12-13 | 北京智能工场科技有限公司 | System and device for assessing psychological state based on text semantic vector model |
CN110689261A (en) * | 2019-09-25 | 2020-01-14 | 苏州思必驰信息科技有限公司 | Service quality evaluation product customization platform and method |
CN110909532A (en) * | 2019-10-31 | 2020-03-24 | 银联智惠信息服务(上海)有限公司 | User name matching method and device, computer equipment and storage medium |
CN111192176A (en) * | 2019-12-30 | 2020-05-22 | 华中师范大学 | Online data acquisition method and device supporting education informatization assessment |
CN111522950A (en) * | 2020-04-26 | 2020-08-11 | 成都思维世纪科技有限责任公司 | Rapid identification system for unstructured massive text sensitive data |
CN112101005A (en) * | 2020-04-02 | 2020-12-18 | 上海迷因网络科技有限公司 | Method for generating and dynamically adjusting quick expressive force test questions |
CN112434897A (en) * | 2020-09-18 | 2021-03-02 | 国家电网有限公司客户服务中心 | Post value evaluation measuring and calculating system based on hierarchical branch calculation model |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1636220A (en) * | 2001-06-08 | 2005-07-06 | 布莱恩集思有限公司 | Method apparatus and computer program for generating and evaluating feelback from a plurality of respondents |
CN103810365A (en) * | 2012-11-13 | 2014-05-21 | 南京河海南自水电自动化有限公司 | Automatic grading method based on hydroelectric simulation training system |
CN103955874A (en) * | 2014-03-31 | 2014-07-30 | 西南林业大学 | Automatic subjective-question scoring system and method based on semantic similarity interval |
CN104375989A (en) * | 2014-12-01 | 2015-02-25 | 国家电网公司 | Natural language text keyword association network construction system |
CN106021288A (en) * | 2016-04-27 | 2016-10-12 | 南京慕测信息科技有限公司 | Method for rapid and automatic classification of classroom testing answers based on natural language analysis |
CN107229719A (en) * | 2017-05-31 | 2017-10-03 | 中南大学 | A kind of career values evaluation method and system |
CN107273861A (en) * | 2017-06-20 | 2017-10-20 | 广东小天才科技有限公司 | A kind of subjective question marking methods of marking, device and terminal device |
CN107330237A (en) * | 2017-05-12 | 2017-11-07 | 广州市润心教育咨询有限公司 | A kind of psychological condition appraisal procedure and system |
CN107506360A (en) * | 2016-06-14 | 2017-12-22 | 科大讯飞股份有限公司 | A kind of essay grade method and system |
US20190005090A1 (en) * | 2017-06-29 | 2019-01-03 | FutureWel Technologies, Inc. | Dynamic semantic networks for language understanding and question answering |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1636220A (en) * | 2001-06-08 | 2005-07-06 | 布莱恩集思有限公司 | Method apparatus and computer program for generating and evaluating feelback from a plurality of respondents |
CN103810365A (en) * | 2012-11-13 | 2014-05-21 | 南京河海南自水电自动化有限公司 | Automatic grading method based on hydroelectric simulation training system |
CN103955874A (en) * | 2014-03-31 | 2014-07-30 | 西南林业大学 | Automatic subjective-question scoring system and method based on semantic similarity interval |
CN104375989A (en) * | 2014-12-01 | 2015-02-25 | 国家电网公司 | Natural language text keyword association network construction system |
CN106021288A (en) * | 2016-04-27 | 2016-10-12 | 南京慕测信息科技有限公司 | Method for rapid and automatic classification of classroom testing answers based on natural language analysis |
CN107506360A (en) * | 2016-06-14 | 2017-12-22 | 科大讯飞股份有限公司 | A kind of essay grade method and system |
CN107330237A (en) * | 2017-05-12 | 2017-11-07 | 广州市润心教育咨询有限公司 | A kind of psychological condition appraisal procedure and system |
CN107229719A (en) * | 2017-05-31 | 2017-10-03 | 中南大学 | A kind of career values evaluation method and system |
CN107273861A (en) * | 2017-06-20 | 2017-10-20 | 广东小天才科技有限公司 | A kind of subjective question marking methods of marking, device and terminal device |
US20190005090A1 (en) * | 2017-06-29 | 2019-01-03 | FutureWel Technologies, Inc. | Dynamic semantic networks for language understanding and question answering |
Non-Patent Citations (3)
Title |
---|
HARBISON, J.ISAIAH 等: "Automated scoring of originality using semantic representations", 《PROCEEDINGS OF COGSCI》 * |
吴志刚 等: "网络考试主动评分模型的开发与应用", 《计算机与教育:实践、创新、未来——全国计算机辅助教育学会第十六届学术年会论文集》 * |
贡喆 等: "有关创造力测量的一些思考", 《心理学进展》 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109903096B (en) * | 2019-03-05 | 2021-06-25 | 上海硕恩网络科技股份有限公司 | Financial consumption psychological test and verification method based on social platform |
CN109903096A (en) * | 2019-03-05 | 2019-06-18 | 上海硕恩网络科技股份有限公司 | A kind of financial consumption psychological test method of calibration based on social platform |
CN110264038A (en) * | 2019-05-22 | 2019-09-20 | 深圳壹账通智能科技有限公司 | A kind of generation method and equipment of product appraisal model |
CN110570941A (en) * | 2019-07-17 | 2019-12-13 | 北京智能工场科技有限公司 | System and device for assessing psychological state based on text semantic vector model |
CN110689261A (en) * | 2019-09-25 | 2020-01-14 | 苏州思必驰信息科技有限公司 | Service quality evaluation product customization platform and method |
US20220351266A1 (en) * | 2019-09-25 | 2022-11-03 | Ai Speech Co., Ltd. | Customization platform and method for service quality evaluation product |
CN110909532A (en) * | 2019-10-31 | 2020-03-24 | 银联智惠信息服务(上海)有限公司 | User name matching method and device, computer equipment and storage medium |
CN111192176A (en) * | 2019-12-30 | 2020-05-22 | 华中师范大学 | Online data acquisition method and device supporting education informatization assessment |
CN111192176B (en) * | 2019-12-30 | 2023-04-28 | 华中师范大学 | Online data acquisition method and device supporting informatization assessment of education |
CN112101005A (en) * | 2020-04-02 | 2020-12-18 | 上海迷因网络科技有限公司 | Method for generating and dynamically adjusting quick expressive force test questions |
CN112101005B (en) * | 2020-04-02 | 2022-08-30 | 上海迷因网络科技有限公司 | Method for generating and dynamically adjusting quick expressive force test questions |
CN111522950A (en) * | 2020-04-26 | 2020-08-11 | 成都思维世纪科技有限责任公司 | Rapid identification system for unstructured massive text sensitive data |
CN111522950B (en) * | 2020-04-26 | 2023-06-27 | 成都思维世纪科技有限责任公司 | Rapid identification system for unstructured massive text sensitive data |
CN112434897A (en) * | 2020-09-18 | 2021-03-02 | 国家电网有限公司客户服务中心 | Post value evaluation measuring and calculating system based on hierarchical branch calculation model |
Also Published As
Publication number | Publication date |
---|---|
CN109299865B (en) | 2021-12-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109299865A (en) | Psychological assessment system and method, information data processing terminal based on semantic analysis | |
Banks et al. | A review of best practice recommendations for text analysis in R (and a user-friendly app) | |
Roberts et al. | Investigating the emotional responses of individuals to urban green space using twitter data: A critical comparison of three different methods of sentiment analysis | |
Celli | Unsupervised personality recognition for social network sites | |
CN106610955A (en) | Dictionary-based multi-dimensional emotion analysis method | |
Lalata et al. | A sentiment analysis model for faculty comment evaluation using ensemble machine learning algorithms | |
Malherbe et al. | Bridge the terminology gap between recruiters and candidates: A multilingual skills base built from social media and linked data | |
Virmani et al. | Sentiment analysis using collaborated opinion mining | |
Schmidt et al. | Senttext: A tool for lexicon-based sentiment analysis in digital humanities | |
Chen et al. | Vector-based similarity measurements for historical figures | |
Devine et al. | Evaluating unsupervised text embeddings on software user feedback | |
CN113627797B (en) | Method, device, computer equipment and storage medium for generating staff member portrait | |
Oberbichler et al. | Topic-specific corpus building: A step towards a representative newspaper corpus on the topic of return migration using text mining methods | |
Flor et al. | Towards automatic annotation of collaborative problem‐solving skills in technology‐enhanced environments | |
CN112115712B (en) | Topic-based group emotion analysis method | |
Brook O’Donnell et al. | Linking neuroimaging with functional linguistic analysis to understand processes of successful communication | |
Li et al. | Expertise network discovery via topic and link analysis in online communities | |
AL-Rubaiee et al. | Tuning of Customer Relationship Management (CRM) via Customer Experience Management (CEM) using sentiment analysis on aspects level | |
O'Connor | Statistical Text Analysis for Social Science. | |
Al Bashaireh et al. | Towards a new indicator for evaluating universities based on twitter sentiment analysis | |
Abed et al. | Detecting subjectivity in staff perfomance appraisals by using text mining: Teachers appraisals of palestinian government case study | |
Rai et al. | Identification of landscape preferences by using social media analysis | |
Chauhan et al. | Implementing lda topic modelling technique to study user reviews in tourism | |
Li et al. | Twitter sentiment analysis of the 2016 US Presidential Election using an emoji training heuristic | |
Umamaheswaran et al. | Mapping Climate Themes From 2008-2021—An Analysis of Business News Using Topic Models |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||