CN111508289A - Language learning system based on word use frequency - Google Patents
Language learning system based on word use frequency
- Publication number
- CN111508289A
- Authority
- CN
- China
- Prior art keywords
- word
- learning
- words
- difficulty
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/06—Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Business, Economics & Management (AREA)
- Educational Technology (AREA)
- Educational Administration (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
The invention discloses a language learning system based on word use frequency, comprising a server and a user side. The server contains a word frequency database, a learning resource database, and a learning resource difficulty marking unit; the learning resources in the learning resource database are resources with text content. The word frequency database stores words of multiple languages; within each language, all words are sorted from high to low by their frequency of use in daily life, and each word is marked with its sorting sequence number. The learning resource difficulty marking unit marks the difficulty of each learning resource entered into the learning resource database based on the frequency values of its words. The user side retrieves the learning resources that match the difficulty value obtained as suitable for the current user and presents them for study. The invention thus lets the user learn high-frequency words first and low-frequency words later, to a certain extent, thereby reducing the learning difficulty.
Description
Technical Field
The invention belongs to the technical field of education software, and particularly relates to a language learning system based on word use frequency.
Background
The English level of most college graduates is below that of a 6-year-old native English-speaking child in the United Kingdom. Such a child can freely understand English conversations among parents, friends, and teachers, can express ideas, opinions, and suggestions in English, can understand and tell stories in English, and can understand English content on television and other media, whereas most college graduates in China cannot. The reason is that the traditional foreign language education system does not conform to the rules of learning and cognition, specifically: the learning content is unscientific, a native-language environment is lacking, and learners listen too little.
1. The learning content is unscientific: the foreign language words and texts in traditional courses are selected subjectively by the textbook writers, so many high-frequency words are never learned while many low-frequency words are learned first. Learners spend a great deal of time memorizing low-frequency words, such as sheet and tiger, that may go unused for a month or even several months. Because the low-frequency vocabulary is hard to use in practice and must be memorized by rote, interest in learning declines and study is hard to sustain;
2. Lack of a native-language environment: learners rarely have long-term contact with foreigners in daily life, and the audio and original-language video available on the internet is scattered, uneven in difficulty, and offered without learning support, so learners cannot immerse themselves in it for long;
3. Too little listening: lacking a native-language environment, foreign language learners listen far too little. A 6-year-old child in the United Kingdom has listened to English for at least 5 hours/day × 365 days/year × 6 years = 10950 hours, whereas a typical graduate in China has listened to English for about 15 minutes/class × 5 classes/week × 20 weeks/semester × 32 semesters ÷ 60 minutes/hour = 800 hours (note: each class lasts 45 minutes, of which 2/3 is explanation in Chinese and only 15 minutes is English). With so little listening, learners cannot understand what foreigners say.
It follows that, to learn English well, learning must be grounded in how English is used in real life, progressing gradually from high-frequency vocabulary to low-frequency vocabulary, and a native-language environment should be created as far as possible so that students hear more authentic English and learn immersively, thereby improving their learning efficiency.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a language learning system based on word use frequency. By marking learning resources with difficulty marks based on word use frequency, the system lets a user who studies the resources in order of increasing difficulty learn high-frequency words first and low-frequency words later, to a certain extent, which reduces the learning difficulty and arouses interest in learning.
In order to solve the technical problems, the invention adopts the technical scheme that: a language learning system based on word use frequency comprises a server and a user side, wherein a word frequency database, a learning resource database and a learning resource difficulty marking unit are arranged in the server, and learning resources in the learning resource database are learning resources with text contents;
the word frequency database stores words of multiple languages; within each language, all words are sorted from high to low by their frequency of use in daily life and numbered in ascending order, and each word is marked with its sorting sequence number;
the learning resource difficulty marking unit is used for marking difficulty information on each learning resource entered into the learning resource database, based on the frequency values of its words;
and the user side is used for retrieving the corresponding learning resources according to the difficulty information for the user to learn.
In the above language learning system based on word use frequency, the learning resource difficulty marking unit marks the difficulty of each learning resource entered into the learning resource database, based on the word frequency values, through the following steps:
step 1, acquiring the text data of the learning resource and performing word segmentation on it;
step 2, de-duplicating the words obtained by segmentation;
step 3, querying the word frequency database for the marked sorting sequence number of each de-duplicated word;
step 4, arranging the sorting sequence numbers found from small to large to obtain a set Q;
step 5, from the last 20-50% of the sorting sequence numbers in the set Q, taking the sorting sequence number B with the largest value and the sorting sequence number A with the smallest value;
step 6, marking the difficulty of the learning resource as [A, B].
Alternatively, in the above language learning system based on word use frequency, the learning resource difficulty marking unit marks the difficulty of each learning resource entered into the learning resource database, based on the word frequency values, through the following steps:
step 1, acquiring the text data of the learning resource and performing word segmentation on it;
step 2, de-duplicating the words obtained by segmentation;
step 3, querying the word frequency database for the marked sorting sequence number of each de-duplicated word;
step 4, arranging the sorting sequence numbers found from small to large to obtain a set Q;
step 5, taking the last 20-50% of the sorting sequence numbers in the set Q, and deleting from the extracted sorting sequence numbers those whose value exceeds a threshold in the range 10000-12000;
step 6, from the sorting sequence numbers remaining after deletion, taking the sorting sequence number B with the largest value and the sorting sequence number A with the smallest value;
step 7, marking the difficulty of the learning resource as [A, B].
Alternatively, in the above language learning system based on word use frequency, the learning resource difficulty marking unit marks the difficulty of each learning resource entered into the learning resource database, based on the word frequency values, through the following steps:
step 1, acquiring the text data of the learning resource and performing word segmentation on it;
step 2, de-duplicating the words obtained by segmentation;
step 3, querying the word frequency database for the marked sorting sequence number of each de-duplicated word;
step 4, arranging the sorting sequence numbers found from small to large to obtain a set Q;
step 5, taking the last 20-50% of the sorting sequence numbers in the set Q, and counting how many of the extracted sorting sequence numbers have a value exceeding a threshold in the range 10000-12000, the count being P; judging whether P exceeds 2% of the total number of sorting sequence numbers in the set Q; if so, proceeding to step 6, and if not, proceeding to step 7;
step 6, from the last 20-50% of the sorting sequence numbers in the set Q, taking the sorting sequence number B with the largest value and the sorting sequence number A with the smallest value; marking the difficulty of the learning resource as [A, B];
step 7, taking the last 20-50% of the sorting sequence numbers in the set Q, and deleting from the extracted sorting sequence numbers those whose value exceeds the threshold; from the sorting sequence numbers remaining after deletion, taking the sorting sequence number B with the largest value and the sorting sequence number A with the smallest value; marking the difficulty of the learning resource as [A, B].
In the above language learning system based on word use frequency, a word dictionary module is provided in the user side;
the word dictionary module comprises a word playing unit and a word learning video playing unit;
the word playing unit is provided with a forward switching button for switching to the word learning video playing unit;
when the forward switching button switches from the word playing unit to the word learning video playing unit, the word learning video playing unit retrieves and plays the learning video of the word that the word playing unit was playing before the switch;
the word learning video playing unit is provided with a reverse switching button for switching to the word playing unit;
when the reverse switching button switches from the word learning video playing unit to the word playing unit, the word playing unit plays the words contained in the video that was playing before the switch, in ascending order of their marked sorting sequence numbers in the word frequency database.
In the above language learning system based on word use frequency, a word consolidation module is provided in the user side;
the word consolidation module is used for presenting to the user the words that need to be consolidated;
the word consolidation module presents the words that need to be consolidated to the user through the following steps:
acquiring information on all words learned by the user within the T time units preceding the current time point;
querying the word frequency database for the marked sorting sequence numbers of all the learned words;
calculating a difficulty mark value Q2 (the formula is not preserved in this text), where Xi is the marked sorting sequence number of the i-th word found in the word frequency database and n is the number of all words learned by the user within the T time units;
taking the words whose marked sorting sequence numbers in the word frequency database fall within the range Q2 ± K as the words that need to be consolidated, where 100 ≤ K ≤ 300.
In the above language learning system based on word use frequency, a language level testing module is provided in the user side;
the language level testing module is used for testing the language level of the user;
the language level testing module tests the language level of a user through the following steps:
step C1, acquiring information on the language to be tested;
step C2, retrieving any word of that language whose sorting sequence number in the word frequency database lies between 2000 and 2500;
step C3, pushing the preset test question for the word to the user, obtaining the test result, and incrementing the test count by 1; if the test result is correct, adding 1 to the test value and proceeding to step C4; if the test result is wrong, subtracting 1 from the test value and proceeding to step C5;
step C4, judging whether the test count has reached a threshold alpha, where alpha ≥ 3; if so, outputting the sorting sequence number of the currently tested word and ending the test; if not, judging whether the test value is greater than or equal to 3; if so, resetting the test value to 0, retrieving the word whose sorting sequence number equals that of the tested word plus 300, and returning to step C3; otherwise, retrieving the word whose sorting sequence number equals that of the tested word plus 10, and returning to step C3;
step C5, judging whether the test count has reached the threshold alpha, where alpha ≥ 3; if so, outputting the sorting sequence number of the currently tested word and ending the test; if not, judging whether the test value is less than or equal to -3; if so, resetting the test value to 0, retrieving the word whose sorting sequence number equals that of the tested word minus 300, and returning to step C3; otherwise, retrieving the word whose sorting sequence number equals that of the tested word minus 10, and returning to step C3.
In the above language learning system based on word use frequency, the user side retrieves the corresponding learning resources according to the difficulty information for the user to learn through the following steps:
step a, acquiring the difficulty marks [A, B] of all learning resources the user has learned so far;
step b, screening all the acquired difficulty marks [A, B] to find the B with the maximum value, Bmax;
step c, acquiring all words in all learning resources the user has learned so far, and de-duplicating them;
step d, acquiring all words in the word frequency database whose sorting sequence number is less than Bmax;
step e, comparing the de-duplicated words from step c with the words acquired in step d, and screening out those words from step d that have not been learned;
step f, arranging the unlearned words screened out in step e from small to large by sorting sequence number;
step g, retrieving the word ranked first in step f;
step h, searching the learning resource database for a learning resource that contains the retrieved word and whose difficulty mark [A, B] has B less than Bmax;
step i, pushing the learning resource found in step h to the user.
In the above language learning system based on word use frequency, in step h the learning resource database is searched for a learning resource that contains the retrieved word and whose difficulty mark [A, B] has B less than Bmax;
if no learning resource meeting this condition is found, the learning resource database is searched for a learning resource that contains the retrieved word and whose difficulty mark [A, B] has B less than Bmax + 300.
Alternatively, in the above language learning system based on word use frequency, the user side retrieves the corresponding learning resources according to the difficulty information for the user to learn through the following steps:
step 1), acquiring the difficulty marks [A, B] of all learning resources the user has learned so far;
step 2), screening all the acquired difficulty marks [A, B] to find the B with the maximum value, Bmax;
step 3), acquiring all words in all learning resources the user has learned so far, and de-duplicating them;
step 4), screening the words obtained in step 3) to find those whose number of times learned, T, is less than Ty, where Ty is a threshold;
step 5), arranging the words screened out in step 4) from small to large by sorting sequence number;
step 6), retrieving the word ranked first in step 5);
step 7), searching the learning resource database for a learning resource that contains the retrieved word and whose difficulty mark [A, B] has B less than Bmax;
step 8), pushing the learning resource found in step 7) to the user.
Compared with the prior art, the invention has the following advantages: by marking learning resources with difficulty marks based on word use frequency, the invention lets a user who studies in order of increasing difficulty learn high-frequency words first and low-frequency words later, to a certain extent, which reduces the learning difficulty, stimulates interest in learning, and enables immersive English learning, thereby improving students' learning efficiency.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
Fig. 1 is a schematic block diagram of the present invention.
Detailed Description
As shown in fig. 1, a language learning system based on word use frequency includes a server and a user side; the server is provided with a word frequency database, a learning resource database, and a learning resource difficulty marking unit, and the learning resources in the learning resource database are resources with text content, such as teaching videos, PPT lectures, textbooks, periodicals, and the like;
the word frequency database stores words of multiple languages; within each language, all words are sorted from high to low by their frequency of use in daily life and numbered in ascending order, and each word is marked with its sorting sequence number;
the frequency of use of a word in daily life is a statistical value obtained through big-data analysis of words extracted from a large volume of content in the language, such as different scenes, different conversations, movies, courseware, and the like;
the learning resource difficulty marking unit is used for marking difficulty information on each learning resource entered into the learning resource database, based on the frequency values of its words;
and the user side is used for retrieving the corresponding learning resources according to the difficulty information for the user to learn.
In this embodiment, the learning resource difficulty marking unit marks the difficulty of each learning resource entered into the learning resource database, based on the word frequency values, through the following steps:
step 1, acquiring the text data of the learning resource and performing word segmentation on it;
step 2, de-duplicating the words obtained by segmentation;
step 3, querying the word frequency database for the marked sorting sequence number of each de-duplicated word;
step 4, arranging the sorting sequence numbers found from small to large to obtain a set Q;
step 5, from the last 20-50% of the sorting sequence numbers in the set Q, taking the sorting sequence number B with the largest value and the sorting sequence number A with the smallest value;
step 6, marking the difficulty of the learning resource as [A, B].
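Steps 1 to 6 above can be illustrated with a minimal Python sketch. It is not the patented implementation: the regular-expression tokenizer, the in-memory dictionary `rank_of` standing in for the word frequency database, the choice to skip words missing from that dictionary, and the `tail_fraction` parameter (any value from 0.2 to 0.5) are all assumptions made for illustration, and the sorting sequence numbers in the example are invented.

```python
import re

def mark_difficulty(text, rank_of, tail_fraction=0.3):
    """Return the difficulty mark (A, B) for a text, or None if no word is ranked."""
    # Step 1: word segmentation (a simple regex tokenizer stands in here).
    words = re.findall(r"[a-z']+", text.lower())
    # Step 2: de-duplication.
    unique_words = set(words)
    # Step 3: look up each word's sorting sequence number.
    # Assumption: words missing from the database are skipped.
    ranks = [rank_of[w] for w in unique_words if w in rank_of]
    if not ranks:
        return None
    # Step 4: sort ascending to obtain the set Q.
    q = sorted(ranks)
    # Step 5: take the last tail_fraction of Q (at least one element).
    tail_len = max(1, round(len(q) * tail_fraction))
    tail = q[-tail_len:]
    # Step 6: A is the smallest and B the largest number in that tail.
    return tail[0], tail[-1]

# Invented sorting sequence numbers, for illustration only.
rank_of = {"the": 1, "on": 9, "cat": 850, "sat": 1200, "mat": 4300}
print(mark_difficulty("The cat sat on the mat", rank_of, tail_fraction=0.4))
# (1200, 4300)
```

Because A and B come only from the tail of Q, the mark reflects the rarest words of the text rather than its many common ones.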
In another embodiment of the present invention, the learning resource difficulty marking unit marks the difficulty of each learning resource entered into the learning resource database, based on the word frequency values, through the following steps:
step 1, acquiring the text data of the learning resource and performing word segmentation on it;
step 2, de-duplicating the words obtained by segmentation;
step 3, querying the word frequency database for the marked sorting sequence number of each de-duplicated word;
step 4, arranging the sorting sequence numbers found from small to large to obtain a set Q;
step 5, taking the last 20-50% of the sorting sequence numbers in the set Q, and deleting from the extracted sorting sequence numbers those whose value exceeds a threshold in the range 10000-12000;
step 6, from the sorting sequence numbers remaining after deletion, taking the sorting sequence number B with the largest value and the sorting sequence number A with the smallest value;
step 7, marking the difficulty of the learning resource as [A, B].
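Steps 5 to 7 of this variant can be sketched as follows, starting from an already sorted set Q. The function name, the `rank_cutoff` default of 10000 (any value from 10000 to 12000), and the decision to return `None` when every number in the tail is deleted are assumptions; the text does not say what happens in that last case.

```python
def mark_difficulty_trimmed(q, tail_fraction=0.3, rank_cutoff=10000):
    """q: sorting sequence numbers in ascending order (the set Q)."""
    if not q:
        return None
    # Step 5a: take the last tail_fraction of Q (at least one element).
    tail_len = max(1, round(len(q) * tail_fraction))
    tail = q[-tail_len:]
    # Step 5b: delete the numbers above the cutoff (very rare words).
    kept = [r for r in tail if r <= rank_cutoff]
    if not kept:
        return None  # assumption: no mark when nothing survives the deletion
    # Steps 6-7: A is the smallest and B the largest surviving number.
    return kept[0], kept[-1]

q = [5, 40, 300, 900, 1500, 2600, 4100, 7800, 11500, 15000]
print(mark_difficulty_trimmed(q, tail_fraction=0.5))
# (2600, 7800)
```

Deleting the numbers above the cutoff keeps a few very rare words, such as proper nouns or jargon, from inflating B.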
In another embodiment of the present invention, the learning resource difficulty marking unit marks the difficulty of each learning resource entered into the learning resource database, based on the word frequency values, through the following steps:
step 1, acquiring the text data of the learning resource and performing word segmentation on it;
step 2, de-duplicating the words obtained by segmentation;
step 3, querying the word frequency database for the marked sorting sequence number of each de-duplicated word;
step 4, arranging the sorting sequence numbers found from small to large to obtain a set Q;
step 5, taking the last 20-50% of the sorting sequence numbers in the set Q, and counting how many of the extracted sorting sequence numbers have a value exceeding a threshold in the range 10000-12000, the count being P; judging whether P exceeds 2% of the total number of sorting sequence numbers in the set Q; if so, proceeding to step 6, and if not, proceeding to step 7;
step 6, from the last 20-50% of the sorting sequence numbers in the set Q, taking the sorting sequence number B with the largest value and the sorting sequence number A with the smallest value; marking the difficulty of the learning resource as [A, B];
step 7, taking the last 20-50% of the sorting sequence numbers in the set Q, and deleting from the extracted sorting sequence numbers those whose value exceeds the threshold; from the sorting sequence numbers remaining after deletion, taking the sorting sequence number B with the largest value and the sorting sequence number A with the smallest value; marking the difficulty of the learning resource as [A, B].
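The branch between step 6 and step 7 can be sketched as below, again starting from a sorted set Q. The function and parameter names, the default values, and the `None` result for an empty selection are assumptions made for illustration.

```python
def mark_difficulty_adaptive(q, tail_fraction=0.3, rank_cutoff=10000, rare_share=0.02):
    """q: sorting sequence numbers in ascending order (the set Q)."""
    if not q:
        return None
    tail_len = max(1, round(len(q) * tail_fraction))
    tail = q[-tail_len:]
    # Step 5: P counts the tail numbers above the cutoff.
    p = sum(1 for r in tail if r > rank_cutoff)
    if p > rare_share * len(q):
        # Step 6: rare words are common enough to count; keep the whole tail.
        chosen = tail
    else:
        # Step 7: rare words are treated as outliers and deleted.
        chosen = [r for r in tail if r <= rank_cutoff]
    if not chosen:
        return None  # assumption: no mark when nothing is left
    return chosen[0], chosen[-1]

# 100 numbers with a single rare word: P = 1 does not exceed 2% of 100.
mostly_common = list(range(100, 10000, 100)) + [11000]
print(mark_difficulty_adaptive(mostly_common))
# (7100, 9900)

# 5 numbers with two rare words: P = 2 exceeds 2% of 5, so the tail is kept.
print(mark_difficulty_adaptive([10, 200, 3000, 10500, 12000], tail_fraction=0.4))
# (10500, 12000)
```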
In this embodiment, a word dictionary module is provided in the user side;
the word dictionary module comprises a word playing unit and a word learning video playing unit;
the word playing unit is provided with a forward switching button for switching to the word learning video playing unit;
when the forward switching button switches from the word playing unit to the word learning video playing unit, the word learning video playing unit retrieves and plays the learning video of the word that the word playing unit was playing before the switch;
the word learning video playing unit is provided with a reverse switching button for switching to the word playing unit;
when the reverse switching button switches from the word learning video playing unit to the word playing unit, the word playing unit plays the words contained in the video that was playing before the switch, in ascending order of their marked sorting sequence numbers in the word frequency database.
It should be noted that the server is further provided with a learning video repository that stores a learning video for each word; when the forward switching button switches from the word playing unit to the word learning video playing unit, the word learning video playing unit retrieves from the learning video repository, and plays, the learning video of the word that the word playing unit was playing before the switch.
In this embodiment, a word consolidation module is provided in the user side;
the word consolidation module is used for presenting to the user the words that need to be consolidated;
the word consolidation module presents the words that need to be consolidated to the user through the following steps:
acquiring information on all words learned by the user within the T time units preceding the current time point;
querying the word frequency database for the marked sorting sequence numbers of all the learned words;
calculating a difficulty mark value Q2 (the formula is not preserved in this text), where Xi is the marked sorting sequence number of the i-th word found in the word frequency database and n is the number of all words learned by the user within the T time units;
taking the words whose marked sorting sequence numbers in the word frequency database fall within the range Q2 ± K as the words that need to be consolidated, where 100 ≤ K ≤ 300.
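The selection of words to consolidate might be sketched as follows. The formula for Q2 does not survive in this text, so the sketch assumes, purely for illustration, that Q2 is the arithmetic mean of the Xi; that assumption is not confirmed by the source. The dictionary `rank_of`, the restriction of candidates to the learned words, and the invented sorting sequence numbers are likewise assumptions.

```python
def words_to_consolidate(learned_words, rank_of, k=200):
    """Return the learned words whose sorting sequence number lies within Q2 +/- k."""
    ranked = [w for w in learned_words if w in rank_of]
    if not ranked:
        return []
    # ASSUMPTION: Q2 is taken to be the mean of the Xi; the source formula is missing.
    q2 = sum(rank_of[w] for w in ranked) / len(ranked)
    # Keep the words inside the window Q2 +/- k (100 <= k <= 300 in the text).
    selected = [w for w in ranked if abs(rank_of[w] - q2) <= k]
    return sorted(selected, key=rank_of.get)

# Invented sorting sequence numbers; their mean is 1000.
rank_of = {"go": 100, "garden": 900, "bridge": 1000, "candle": 1150, "harvest": 1850}
print(words_to_consolidate(list(rank_of), rank_of, k=200))
# ['garden', 'bridge', 'candle']
```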
In this embodiment, a language level testing module is provided in the user side;
the language level testing module is used for testing the language level of the user;
the language level testing module tests the language level of a user through the following steps:
step C1, acquiring information on the language to be tested;
step C2, retrieving any word of that language whose sorting sequence number in the word frequency database lies between 2000 and 2500;
step C3, pushing the preset test question for the word to the user, obtaining the test result, and incrementing the test count by 1; if the test result is correct, adding 1 to the test value and proceeding to step C4; if the test result is wrong, subtracting 1 from the test value and proceeding to step C5;
step C4, judging whether the test count has reached a threshold alpha, where alpha ≥ 3; if so, outputting the sorting sequence number of the currently tested word and ending the test; if not, judging whether the test value is greater than or equal to 3; if so, resetting the test value to 0, retrieving the word whose sorting sequence number equals that of the tested word plus 300, and returning to step C3; otherwise, retrieving the word whose sorting sequence number equals that of the tested word plus 10, and returning to step C3;
step C5, judging whether the test count has reached the threshold alpha, where alpha ≥ 3; if so, outputting the sorting sequence number of the currently tested word and ending the test; if not, judging whether the test value is less than or equal to -3; if so, resetting the test value to 0, retrieving the word whose sorting sequence number equals that of the tested word minus 300, and returning to step C3; otherwise, retrieving the word whose sorting sequence number equals that of the tested word minus 10, and returning to step C3.
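The loop formed by steps C3 to C5 can be sketched as below. The callback `answers_correctly`, which stands in for pushing a test question and reading the result, the fixed `start_rank`, and the clamping of the sorting sequence number to the interval from 1 to `max_rank` are assumptions; the text does not say what happens at the ends of the database.

```python
def estimate_level(answers_correctly, start_rank=2000, alpha=20, max_rank=20000):
    """Return the sorting sequence number of the last word tested."""
    rank = start_rank      # step C2: any number from 2000 to 2500
    test_value = 0
    test_count = 0
    while True:
        # Step C3: ask about the current word and update both counters.
        correct = answers_correctly(rank)
        test_count += 1
        test_value += 1 if correct else -1
        # Steps C4/C5: stop once the test count reaches alpha.
        if test_count >= alpha:
            return rank
        if correct:
            # Step C4: big jump after a net run of 3 correct answers, else a small step.
            if test_value >= 3:
                test_value, step = 0, 300
            else:
                step = 10
        else:
            # Step C5: mirror image for wrong answers.
            if test_value <= -3:
                test_value, step = 0, -300
            else:
                step = -10
        # Assumption: keep the number inside the database range.
        rank = min(max_rank, max(1, rank + step))

# Simulated learner who knows every word numbered 3000 or lower.
print(estimate_level(lambda rank: rank <= 3000))
# 3270
```

With alpha = 3 the test ends before a jump of 300 can occur, so larger values of alpha are needed for the estimate to move far from the starting word.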
In this embodiment, the user side retrieves the corresponding learning resources according to the difficulty information for the user to learn through the following steps:
step a, acquiring the difficulty marks [A, B] of all learning resources the user has learned so far;
step b, screening all the acquired difficulty marks [A, B] to find the B with the maximum value, Bmax;
step c, acquiring all words in all learning resources the user has learned so far, and de-duplicating them;
step d, acquiring all words in the word frequency database whose sorting sequence number is less than Bmax;
step e, comparing the de-duplicated words from step c with the words acquired in step d, and screening out those words from step d that have not been learned;
step f, arranging the unlearned words screened out in step e from small to large by sorting sequence number;
step g, retrieving the word ranked first in step f;
step h, searching the learning resource database for a learning resource that contains the retrieved word and whose difficulty mark [A, B] has B less than Bmax;
step i, pushing the learning resource found in step h to the user.
In this embodiment, in step h the learning resource database is searched for a learning resource that contains the retrieved word and whose difficulty mark [A, B] has B less than Bmax;
if no learning resource meeting this condition is found, the learning resource database is searched for a learning resource that contains the retrieved word and whose difficulty mark [A, B] has B less than Bmax + 300.
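Steps a to i, together with the fallback to Bmax + 300, can be sketched as follows. The representation of a learning resource as a dictionary with `title`, `words`, and `mark` keys, the function name, and the invented data are assumptions made for illustration; the sketch returns the first matching resource, while the text does not say which one to prefer when several match.

```python
def next_resource(learned, library, rank_of, slack=300):
    """Pick a resource that teaches the most frequent word the user has not met yet."""
    if not learned:
        return None
    # Steps a-b: Bmax is the largest B among the marks of the learned resources.
    b_max = max(res["mark"][1] for res in learned)
    # Step c: all distinct words the user has met so far.
    seen = set().union(*(res["words"] for res in learned))
    # Steps d-f: unlearned words numbered below Bmax, most frequent first.
    unlearned = sorted(
        (w for w, r in rank_of.items() if r < b_max and w not in seen),
        key=rank_of.get,
    )
    if not unlearned:
        return None
    # Step g: the first word in that order.
    target = unlearned[0]
    # Step h, then the fallback: require B < Bmax, then B < Bmax + slack.
    for limit in (b_max, b_max + slack):
        for res in library:
            if target in res["words"] and res["mark"][1] < limit:
                return res  # step i: this resource is pushed to the user
    return None

# Invented data, for illustration only.
rank_of = {"the": 1, "dog": 700, "river": 1500, "bridge": 2100, "lantern": 3900}
learned = [{"title": "Lesson 1", "words": {"the", "dog", "bridge"}, "mark": (700, 2100)}]
library = [
    {"title": "River story", "words": {"the", "river", "lantern"}, "mark": (1500, 3900)},
    {"title": "River walk", "words": {"the", "river", "dog"}, "mark": (700, 1500)},
]
print(next_resource(learned, library, rank_of)["title"])
# River walk
```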
In another embodiment of the present invention, the user side calls the corresponding learning resources according to the difficulty information and provides them to the user for learning, which includes the following steps:
step 1), acquiring the difficulty marks [A, B] of all learning resources the user has learned so far;
step 2), screening all acquired difficulty marks [A, B] to find the B with the maximum value, B_max;
step 3), acquiring all words in all learning resources the user has learned so far, and deduplicating them;
step 4), screening the words obtained in step 3) to select the words whose learning frequency T is less than T_y, where T_y is a threshold; preferably, T_y = 20;
step 5), arranging the words screened out in step 4) in ascending order of sorting sequence number;
step 6), calling the word ranked first in step 5);
step 7), searching the learning resource database for learning resources that contain the called word and whose difficulty mark [A, B] satisfies B < B_max;
step 8), pushing the learning resources found in step 7) to the user.
It should be noted that requiring B < B_max ensures that the learning resources pushed to the user are not too difficult and suit the user's current learning ability.
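Steps 1) to 8) of this second embodiment can be sketched in the same spirit. The sketch below is illustrative only: the dictionary layout of a resource, the `times_learned` counter (read here as the learning frequency T of each word) and the function name are assumptions, and words missing from the word frequency database are simply skipped.

```python
def pick_by_learning_count(learned, database, rank, times_learned, t_y=20):
    """Return resources that revisit the most frequent under-practised word.

    learned       -- list of dicts {"words": set, "mark": (A, B)}, assumed non-empty
    database      -- list of dicts with the same layout
    rank          -- dict mapping each word to its sorting sequence number
    times_learned -- dict mapping each word to its learning frequency T
    t_y           -- threshold T_y (the text prefers 20)
    """
    b_max = max(r["mark"][1] for r in learned)                # steps 1-2
    seen = set().union(*(r["words"] for r in learned))        # step 3
    weak = [w for w in seen                                   # step 4
            if w in rank and times_learned.get(w, 0) < t_y]
    if not weak:
        return []
    target = min(weak, key=rank.__getitem__)                  # steps 5-6
    return [r for r in database                               # steps 7-8
            if target in r["words"] and r["mark"][1] < b_max]
```

Unlike the first embodiment, the target word here comes from words the user has already met, so the search favours repetition over new vocabulary.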
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and all simple modifications, changes and equivalent structural changes made to the above embodiment according to the technical spirit of the present invention still fall within the protection scope of the technical solution of the present invention.
Claims (10)
1. A language learning system based on word usage frequency, comprising: the system comprises a server and a user side, wherein a word frequency database, a learning resource database and a learning resource difficulty marking unit are arranged in the server, and learning resources in the learning resource database are learning resources with text contents;
the word frequency database stores words of multiple languages; all the words in each language are sorted from high to low by their frequency of use in everyday life, the sorting sequence numbers run from small to large, and each word is marked with its sorting sequence number;
the learning resource difficulty marking unit is used for marking difficulty information on each learning resource input into the learning resource database, based on the frequency values of its words;
and the user side is used for calling corresponding learning resources according to the difficulty information for the user to learn.
2. A language learning system based on word usage frequency according to claim 1, characterized in that: when the learning difficulty marking unit marks the difficulty of the learning resources in each input learning resource database based on the frequency value of the words, the method comprises the following steps:
step 1, acquiring text data of learning resources, and performing word segmentation processing on the text data;
step 2, carrying out duplicate removal treatment on the words obtained by segmentation;
step 3, inquiring the marked sequencing serial number of the deduplicated words in a word frequency database;
step 4, arranging the searched sequencing serial numbers from small to large to obtain a set Q;
step 5, taking the sequencing serial number B with the largest value and the sequencing serial number A with the smallest value from the last 20-50% of the sequencing serial numbers in the set Q;
and 6, marking the difficulty of the learning resources as [ A, B ].
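As an illustration of steps 1 to 6, the following Python sketch marks a text with [A, B]. It rests on several assumptions not stated in the claim: word segmentation is done with a simple regular expression suited to space-delimited lowercase Latin text, words absent from the word frequency database are ignored, and `tail` is the chosen fraction within the 20-50% range. The names `mark_difficulty`, `rank` and `tail` are hypothetical.

```python
import math
import re


def mark_difficulty(text, rank, tail=0.3):
    """Return the difficulty mark (A, B) of a text, or None if no word is ranked.

    rank -- dict mapping each word to its sorting sequence number
    tail -- fraction of the rarest ranks to keep (0.2 to 0.5)
    """
    if not 0.2 <= tail <= 0.5:
        raise ValueError("tail should lie in the 20-50% range")
    words = set(re.findall(r"[a-z']+", text.lower()))   # steps 1-2: segment, deduplicate
    q = sorted(rank[w] for w in words if w in rank)     # steps 3-4: look up ranks, sort
    if not q:
        return None
    k = max(1, math.ceil(len(q) * tail))                # size of the last `tail` share of Q
    segment = q[-k:]
    return segment[0], segment[-1]                      # steps 5-6: smallest A, largest B
```

Because Q is sorted in ascending order, A and B are simply the first and last elements of the retained tail.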
3. A language learning system based on word usage frequency according to claim 1, characterized in that: when the learning difficulty marking unit marks the difficulty of the learning resources in each input learning resource database based on the frequency value of the words, the method comprises the following steps:
step 1, acquiring text data of learning resources, and performing word segmentation processing on the text data;
step 2, carrying out duplicate removal treatment on the words obtained by segmentation;
step 3, inquiring the marked sequencing serial number of the deduplicated words in a word frequency database;
step 4, arranging the searched sequencing serial numbers from small to large to obtain a set Q;
step 5, taking the last 20-50% of the sorting serial numbers in the set Q, and deleting from the extracted sorting serial numbers those whose value exceeds a cutoff in the range 10000-12000;
step 6, extracting the sequencing serial number B with the largest value and the sequencing serial number A with the smallest value from the sequencing serial numbers left after deletion;
and 7, marking the difficulty of the learning resources as [ A, B ].
4. A language learning system based on word usage frequency according to claim 1, characterized in that: when the learning difficulty marking unit marks the difficulty of the learning resources in each input learning resource database based on the frequency value of the words, the method comprises the following steps:
step 1, acquiring text data of learning resources, and performing word segmentation processing on the text data;
step 2, carrying out duplicate removal treatment on the words obtained by segmentation;
step 3, inquiring the marked sequencing serial number of the deduplicated words in a word frequency database;
step 4, arranging the searched sequencing serial numbers from small to large to obtain a set Q;
step 5, taking the last 20-50% of the sorting serial numbers in the set Q, and counting how many of the extracted sorting serial numbers have a value exceeding a cutoff in the range 10000-12000, the count being P; judging whether P exceeds 2% of the total number of sorting serial numbers in the set Q; if so, entering step 6, and if not, entering step 7;
step 6, taking the sorting serial number B with the largest value and the sorting serial number A with the smallest value from the last 20-50% of the sorting serial numbers in the set Q; the difficulty of the learning resource is marked as [A, B];
step 7, taking the last 20-50% of the sorting serial numbers in the set Q, and deleting from the extracted sorting serial numbers those whose value exceeds the cutoff in the range 10000-12000; extracting the sorting serial number B with the largest value and the sorting serial number A with the smallest value from the sorting serial numbers left after deletion; the difficulty of the learning resource is marked as [A, B].
5. A language learning system based on word usage frequency according to claim 1, characterized in that: a word dictionary module is arranged in the user side;
the word dictionary module comprises a word playing unit and a word learning video playing unit;
a forward switching button for switching to the word learning video playing unit is arranged on the word playing unit;
when the word playing unit is switched to the word learning video playing unit through the forward switching button, the word learning video playing unit calls and plays the learning video of the word that was being played before the switch;
a reverse switching button for switching to the word playing unit is arranged on the word learning video playing unit;
when the word learning video playing unit is switched to the word playing unit through the reverse switching button, the word playing unit plays the words contained in the video that was being played before the switch, in ascending order of their marked sorting sequence numbers in the word frequency database.
6. A language learning system based on word usage frequency according to claim 1, characterized in that: a word consolidation module is arranged in the user side;
the word consolidation module is used for presenting the words needing to be consolidated to the user;
the word consolidation module comprises the following steps when presenting the words needing consolidation to a user:
acquiring all word information learned by a user within T time units from the current time point;
inquiring the marked sequencing serial numbers of all the learned words in a word frequency database;
calculating a difficulty mark value Q2 from the queried sorting sequence numbers, where Xi is the marked sorting sequence number of the i-th learned word in the word frequency database, and n is the number of all words learned by the user within the T time units;
the words whose marked sorting sequence numbers in the word frequency database fall within the range Q2 ± K are the words needing to be consolidated, where 100 ≤ K ≤ 300.
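The final selection step can be sketched as below. The expression for Q2 is not reproduced in this text, so the sketch takes Q2 as an input rather than computing it; it also assumes that candidates are drawn from the words the user has learned. The function and parameter names are hypothetical.

```python
def words_to_consolidate(learned_words, rank, q2, k=200):
    """Return learned words whose rank lies within q2 - k .. q2 + k.

    learned_words -- words learned within the last T time units
    rank          -- dict mapping each word to its sorting sequence number
    q2            -- difficulty mark value, computed elsewhere
    k             -- half-width K of the window, 100 <= k <= 300
    """
    if not 100 <= k <= 300:
        raise ValueError("K must satisfy 100 <= K <= 300")
    selected = {w for w in learned_words
                if w in rank and abs(rank[w] - q2) <= k}
    return sorted(selected, key=rank.__getitem__)   # most frequent word first
```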
7. A language learning system based on word usage frequency according to claim 1, characterized in that: a language level testing module is arranged in the user side;
the language level testing module is used for testing the language level of the user;
the language level testing module comprises the following steps when testing the language level of a user:
step C1, acquiring language information needed to be used for testing;
step C2, calling any word of that language whose sorting sequence number in the word frequency database is between 2000 and 2500;
step C3, pushing a preset test question for the word to the user, obtaining the test result, and incrementing the number of tests by 1; if the test result is correct, adding 1 to the test value and entering step C4; if the test result is wrong, subtracting 1 from the test value and entering step C5;
step C4, judging whether the number of tests has reached a threshold α, where α ≥ 3; if yes, outputting the sorting sequence number of the currently tested word and ending the test; if not, judging whether the test value is not less than 3; if so, resetting the test value to 0, calling the word whose sorting sequence number equals that of the tested word plus 300, and returning to step C3; otherwise, calling the word whose sorting sequence number equals that of the tested word plus 10, and returning to step C3;
step C5, judging whether the number of tests has reached the threshold α, where α ≥ 3; if yes, outputting the sorting sequence number of the currently tested word and ending the test; if not, judging whether the test value is less than or equal to -3; if so, resetting the test value to 0, calling the word whose sorting sequence number equals that of the tested word minus 300, and returning to step C3; otherwise, calling the word whose sorting sequence number equals that of the tested word minus 10, and returning to step C3.
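The test loop of steps C2 to C5 can be sketched as a small adaptive procedure. This is a sketch under assumptions: `answer` is a hypothetical callback that poses the preset question for the word at a given rank and returns True for a correct answer, the default α of 20 is arbitrary (the claim only requires α ≥ 3), the optional `start` parameter exists only to make runs reproducible, and clamping the rank at 1 is an added safeguard not mentioned in the claim.

```python
import random


def estimate_level(answer, alpha=20, start=None):
    """Return the sorting sequence number reached after `alpha` questions.

    answer -- callable(rank) -> bool, True if the user answered correctly
    alpha  -- threshold on the number of tests, must be at least 3
    start  -- optional fixed starting rank instead of a random one
    """
    if alpha < 3:
        raise ValueError("alpha must be at least 3")
    rank = start if start is not None else random.randint(2000, 2500)  # step C2
    asked, score = 0, 0
    while True:
        correct = answer(rank)                 # step C3: ask and record
        asked += 1
        score += 1 if correct else -1
        if asked >= alpha:                     # steps C4/C5: stop condition
            return rank
        if correct:
            if score >= 3:                     # step C4: large jump up
                score, rank = 0, rank + 300
            else:
                rank += 10
        else:
            if score <= -3:                    # step C5: large jump down
                score, rank = 0, rank - 300
            else:
                rank -= 10
        rank = max(1, rank)                    # assumption: ranks start at 1
```

The test value acts as a running balance of correct and wrong answers, so three net correct answers trigger the 300-rank jump while mixed answers move the rank in steps of 10.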
8. A language learning system based on word usage frequency according to claim 2 or 3 or 4 characterized in that: the user side calls corresponding learning resources according to the difficulty information for the user to learn, and the method comprises the following steps:
step a, acquiring the difficulty marks [A, B] of all learning resources the user has learned so far;
step b, screening all acquired difficulty marks [A, B] to find the B with the maximum value, B_max;
step c, acquiring all words in all learning resources the user has learned so far, and deduplicating them;
step d, obtaining all words in the word frequency database whose sorting sequence number is less than B_max;
step e, comparing the deduplicated words from step c with the words obtained in step d, and screening out those words from step d that have not yet been learned;
step f, arranging the unlearned words screened out in step e in ascending order of sorting sequence number;
step g, calling the word ranked first in step f;
step h, searching the learning resource database for learning resources that contain the called word and whose difficulty mark [A, B] satisfies B < B_max;
step i, pushing the learning resources found in step h to the user.
9. A language learning system based on word usage frequency according to claim 8 wherein: in step h, the learning resource database is first searched for learning resources that contain the called word and whose difficulty mark [A, B] satisfies B < B_max;
if no learning resource meets this condition, the learning resource database is searched again for learning resources that contain the called word and whose difficulty mark [A, B] satisfies B < B_max + 300.
10. A language learning system based on word usage frequency according to claim 1, characterized in that: the user side calls corresponding learning resources according to the difficulty information for the user to learn, and the method comprises the following steps:
step 1), acquiring the difficulty marks [A, B] of all learning resources the user has learned so far;
step 2), screening all acquired difficulty marks [A, B] to find the B with the maximum value, B_max;
step 3), acquiring all words in all learning resources the user has learned so far, and deduplicating them;
step 4), screening the words obtained in step 3) to select the words whose learning frequency T is less than T_y, where T_y is a threshold;
step 5), arranging the words screened out in step 4) in ascending order of sorting sequence number;
step 6), calling the word ranked first in step 5);
step 7), searching the learning resource database for learning resources that contain the called word and whose difficulty mark [A, B] satisfies B < B_max;
step 8), pushing the learning resources found in step 7) to the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010291647.XA CN111508289B (en) | 2020-04-14 | 2020-04-14 | Language learning system based on word use frequency |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010291647.XA CN111508289B (en) | 2020-04-14 | 2020-04-14 | Language learning system based on word use frequency |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111508289A (en) | 2020-08-07 |
CN111508289B (en) | 2021-10-08 |
Family
ID=71876003
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010291647.XA (Active), CN111508289B (en) | Language learning system based on word use frequency | 2020-04-14 | 2020-04-14 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111508289B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112991122A (en) * | 2021-05-10 | 2021-06-18 | 北京世纪好未来教育科技有限公司 | Planning method and device for Chinese character teaching |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090197225A1 (en) * | 2008-01-31 | 2009-08-06 | Kathleen Marie Sheehan | Reading level assessment method, system, and computer program product for high-stakes testing applications |
CN101814066A (en) * | 2009-02-23 | 2010-08-25 | 富士通株式会社 | Text reading difficulty judging device and method thereof |
CN101877181A (en) * | 2009-04-28 | 2010-11-03 | 夏普株式会社 | Apparatus and method for automatically generating personalized learning and diagnostic exercises |
CN104392640A (en) * | 2014-11-07 | 2015-03-04 | 曾立人 | Computer assisted foreign language corpus providing method and system |
US20170316708A1 (en) * | 2016-04-29 | 2017-11-02 | Rovi Guides, Inc. | Systems and methods for providing word definitions based on user exposure |
US20180061274A1 (en) * | 2016-08-27 | 2018-03-01 | Gereon Frahling | Systems and methods for generating and delivering training scenarios |
CN107943993A (en) * | 2017-12-04 | 2018-04-20 | 西北民族大学 | Method and system for learning Chinese based on complex network |
Also Published As
Publication number | Publication date |
---|---|
CN111508289B (en) | 2021-10-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |