CN113934922A

CN113934922A - Intelligent recommendation method, device, equipment and computer storage medium

Info

Publication number: CN113934922A
Application number: CN202010674177.5A
Authority: CN
Inventors: 孔令凯; 乔丽雯; 刘艳蕊; 邢荣荣; 杨育; 龚昌博; 王建佳
Original assignee: China Mobile Communications Group Co Ltd; China Mobile Chengdu ICT Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; China Mobile Chengdu ICT Co Ltd
Priority date: 2020-07-14
Filing date: 2020-07-14
Publication date: 2022-01-14

Abstract

An embodiment of the invention provides an intelligent recommendation method, apparatus, device and computer storage medium. The method comprises: obtaining the wrong question information corresponding to a wrong-question mark in a picture, screening recommended questions corresponding to the wrong question information from a preset question database, and pushing the recommended questions to the user's terminal. With the wrong question information obtained in this way, wrong questions are extracted accurately, their knowledge points are analyzed, and recommended questions are screened based on those knowledge points.

Description

Intelligent recommendation method, device, equipment and computer storage medium

Technical Field

The invention belongs to the technical field of computer application, and particularly relates to an intelligent recommendation method, device and equipment and a computer storage medium.

Background

With the development of computer technology, searching for questions by photographing them has become an indispensable learning tool for today's students. Primary and secondary school students carry a heavy study load and need to do a large number of exercises on ordinary days to consolidate knowledge points. Reviewing wrong questions is a crucial step in the learning process: practicing repeatedly around the knowledge points involved in the wrong questions helps students overcome their weak points, and photo-based question search can effectively shorten the time students spend collecting exercises.

At present, photo-based question search built on HyperText Markup Language 5 (HTML5) technology serves as the functional entry point: an Optical Character Recognition (OCR) image recognition engine collects user behavior data, and personalized resource recommendation is realized through machine-learning analysis and modeling and intelligent algorithms. However, conventional OCR technology focuses on recognizing images and converting them into text symbols. It does not analyze or extract the recognition results in a targeted way, and therefore cannot accurately recommend exercise resources according to the knowledge points involved in wrong questions.

Disclosure of Invention

The embodiment of the invention provides an intelligent recommendation method, device and equipment and a computer storage medium, which can extract wrong questions, extract knowledge points of the wrong questions based on text analysis and recommend resources related to the wrong knowledge points.

In a first aspect, an embodiment of the present invention provides an intelligent recommendation method, where the method includes:

acquiring wrong question information corresponding to wrong question identification in the picture;

screening a recommended question corresponding to wrong question information from a preset question database;

and pushing the recommended question to the terminal of the user.

In an optional implementation manner, obtaining wrong question information corresponding to a wrong-question mark in a picture includes:

identifying character information corresponding to the wrong-question mark in the picture;

searching for position information of a start character and position information of an end character of the character information;

and acquiring the wrong question information according to the position information of the start character and the position information of the end character.

In an optional implementation manner, recognizing character information corresponding to a wrong question identifier in a picture includes:

extracting text block information in the picture;

extracting text line information based on the text block information;

extracting single character information based on the text line information;

and identifying the single character information through an optical character identification model to obtain the character information.

In an optional implementation manner, screening a recommended question corresponding to the wrong question information from a preset question database includes:

calculating the similarity between the wrong question information and the questions in the question database to obtain the questions to be recommended corresponding to the wrong question information;

and calculating the importance of the questions to be recommended, and screening the front N recommended questions corresponding to the wrong question information according to the arrangement sequence of the importance from high to low.

In an alternative implementation, the similarity is calculated based on a deep semantic matching model.
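The similarity-based screening step can be sketched as follows. This is a minimal illustration only: it assumes that each question has already been mapped to a semantic vector by some matching model, and it uses cosine similarity between those vectors, which the text does not specify. The vectors, the threshold value and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def candidate_questions(wrong_vec, bank, threshold=0.8):
    """Return (question_id, similarity) pairs at or above the threshold, most similar first.

    The threshold is an assumed cut-off, not a value taken from the text.
    """
    scored = [(qid, cosine_similarity(wrong_vec, vec)) for qid, vec in bank.items()]
    kept = [(qid, s) for qid, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical semantic vectors for three questions in the question database.
bank = {"q1": [1.0, 0.0, 0.0], "q2": [0.9, 0.1, 0.0], "q3": [0.0, 1.0, 0.0]}
candidates = candidate_questions([1.0, 0.0, 0.0], bank)
```

Here `q1` and `q2` are kept as questions to be recommended, while `q3`, whose vector is orthogonal to that of the wrong question, is dropped.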

In an alternative implementation, the formula for calculating the importance is:

$$WS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{w_{ji}}{\sum_{V_k \in Out(V_j)} w_{jk}} \, WS(V_j)$$

where $d$ is the damping coefficient, $w_{ji}$ is the similarity between questions $V_i$ and $V_j$, $In(V_i)$ is the set of nodes pointing to node $V_i$, $Out(V_j)$ is the set of nodes pointed to by the edges starting from node $V_j$, $WS(V_i)$ is the importance of question $V_i$, $WS(V_j)$ is the importance of question $V_j$, and $w_{jk}$ is the similarity between questions $V_j$ and $V_k$. The importance of the questions to be recommended is calculated by iteration.
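The iterative calculation can be sketched as below, using the damping coefficient, the pairwise similarities as edge weights, and the in/out neighbor sets defined above. The value `d = 0.85`, the fixed iteration count, the sample weights and the function names are assumptions made for illustration.

```python
def rank_importance(weights, d=0.85, iterations=50):
    """Iterate WS(V_i) = (1 - d) + d * sum_j (w_ji / sum_k w_jk) * WS(V_j).

    weights[j][i] is the similarity w_ji on the edge from question j to question i.
    d and iterations are assumed values, not taken from the text.
    """
    nodes = list(weights)
    out_sum = {j: sum(weights[j].values()) for j in nodes}
    scores = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        new_scores = {}
        for i in nodes:
            incoming = 0.0
            for j in nodes:
                w_ji = weights[j].get(i, 0.0)
                if w_ji > 0.0 and out_sum[j] > 0.0:
                    incoming += w_ji / out_sum[j] * scores[j]
            new_scores[i] = (1.0 - d) + d * incoming
        scores = new_scores
    return scores

def top_n(scores, n):
    """Return the n question ids with the highest importance, highest first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical symmetric similarity graph over three candidate questions.
weights = {
    "A": {"B": 0.9, "C": 0.8},
    "B": {"A": 0.9, "C": 0.2},
    "C": {"A": 0.8, "B": 0.2},
}
scores = rank_importance(weights)
best = top_n(scores, 2)
```

In this small graph, question `A` is strongly similar to both other questions and therefore ends up with the highest importance.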

In a second aspect, an embodiment of the present invention provides an intelligent recommendation apparatus, where the apparatus includes:

the acquisition module is used for acquiring wrong question information corresponding to the wrong question identification in the picture;

the screening module is used for screening recommended questions corresponding to wrong question information from a preset question database;

and the recommendation module is used for pushing the recommendation question to the terminal of the user.

In an optional implementation manner, the acquisition module includes:

the recognition module is used for recognizing character information corresponding to the wrong question mark in the picture;

the positioning module is used for searching the position information of the initial character and the position information of the end character of the character information;

and the extraction module is used for acquiring the wrong question information according to the position information of the initial character and the position information of the end character.

In a third aspect, an embodiment of the present invention provides an intelligent recommendation device, where the device includes: a processor, and a memory storing computer program instructions; the processor reads and executes the computer program instructions to implement the intelligent recommendation method in the first aspect or any one of the possible implementation manners of the first aspect.

In a fourth aspect, an embodiment of the present invention provides a computer storage medium, where computer program instructions are stored on the computer storage medium, and when the computer program instructions are executed by a processor, the method for intelligently recommending in the first aspect or any one of the possible implementation manners of the first aspect is implemented.

Embodiments of the invention provide an intelligent recommendation method, apparatus, device and computer storage medium. The method comprises: obtaining the wrong question information corresponding to a wrong-question mark in a picture, screening recommended questions corresponding to the wrong question information from a preset question database, and pushing the recommended questions to the user's terminal. In this way wrong questions are extracted accurately, their knowledge points are analyzed, and recommended questions are screened based on those knowledge points.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a schematic flowchart of an intelligent recommendation method according to an embodiment of the present invention;

Fig. 2 is a schematic flowchart of acquiring wrong question information according to an embodiment of the present invention;

Fig. 3 shows the deep neural network of an intelligent recommendation method according to an embodiment of the present invention;

Fig. 4A is the original text of a text block segmentation method according to an embodiment of the present invention;

Fig. 4B is the segmented text of a text block segmentation method according to an embodiment of the present invention;

Fig. 5A is a vertical projection diagram of the text in a line text extraction method according to an embodiment of the present invention;

Fig. 5B is a horizontal projection diagram of the text in a line text extraction method according to an embodiment of the present invention;

Fig. 5C is a schematic line cutting diagram of a line text extraction method according to an embodiment of the present invention;

Fig. 6 is a diagram of the common mathematical symbols of the K12 question bank according to an embodiment of the present invention;

Fig. 7 is a diagram illustrating the image recognition of an intelligent recommendation method according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of extracting a single question in an intelligent recommendation method according to an embodiment of the present invention;

Fig. 9A is a training sample picture of a deblurring method according to an embodiment of the present invention;

Fig. 9B is a blurred picture of a deblurring method according to an embodiment of the present invention;

Fig. 9C is a reconstructed picture of a deblurring method according to an embodiment of the present invention;

Fig. 10 is a schematic structural diagram of the DSSM neural network of an intelligent recommendation method according to an embodiment of the present invention;

Fig. 11 is a training curve of a DSSM neural network according to an embodiment of the present invention;

Fig. 12A is a diagram of a student's mastery of knowledge points in an intelligent recommendation method according to an embodiment of the present invention;

Fig. 12B is a question similarity relationship diagram of an intelligent recommendation method according to an embodiment of the present invention;

Fig. 13 is a schematic structural diagram of the recurrent neural network of an intelligent recommendation method according to an embodiment of the present invention;

Fig. 14 is a schematic structural diagram of an intelligent recommendation apparatus according to an embodiment of the present invention;

Fig. 15 is a schematic structural diagram of the acquisition module of an intelligent recommendation apparatus according to an embodiment of the present invention;

Fig. 16 is a schematic structural diagram of an intelligent recommendation device according to an embodiment of the present invention.

Detailed Description

Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by illustrating examples of the present invention.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Classifying and analyzing the wrong questions in homework and test papers, and then practicing extensively in a targeted way, is an effective study method for primary and secondary school students. It helps students master weak knowledge points, gives later review a clearer focus, and avoids a round of redundant extracurricular exercises. However, manually searching for a large number of questions takes too much time and energy and cuts into the time available for effective study. Existing question recommendation mechanisms require students to answer questions inside a system, and can recommend questions only after a wrong-question history has been generated there. This adds a large number of exercises on top of the homework assigned by schools and undoubtedly increases the students' workload.

At present, photo-based question search relies on HTML5 technology and an OCR image recognition engine to collect picture information. Conventional OCR struggles with pictures that mix mathematical expressions and text, and with pictures that are unclear because of low resolution, heavy blur, low illumination or similar conditions in the shooting environment. The accuracy of the recognized text therefore drops, which hinders analysis of the questions and prevents accurate question pushing.

To solve the problems in the prior art, embodiments of the present invention provide an intelligent recommendation method, apparatus, device and computer storage medium. The character information corresponding to a wrong-question mark in a picture is recognized, the position information of the start character and of the end character of that character information is located, and the wrong question information is obtained from those two positions. This improves the accuracy of wrong-question recognition and positioning, improves the precision with which wrong questions are cut out, and helps cut out wrong questions quickly and accurately. Recommended questions corresponding to the wrong question information are then screened from a preset question database and pushed to the user's terminal, so that the user can quickly, conveniently and accurately obtain a large number of questions related to the knowledge points of the wrong questions.

First, the intelligent recommendation method provided by the embodiment of the invention is described below.

Fig. 1 is a flowchart illustrating an intelligent recommendation method according to an embodiment of the present invention. As shown in fig. 1, the method may include the steps of:

S110, acquiring wrong question information corresponding to a wrong-question mark in a picture.

A picture containing wrong-question marks is obtained by scanning or photographing a test paper, textbook or supplementary teaching book, and the wrong question information is obtained from it. A wrong-question mark may be a correction mark made by a teacher when grading a test paper or homework, or a note added by a student during self-checking; such a note may be a correction record for the wrong question or a supplementary record of something that was overlooked when answering. The wrong question information includes all characters that may appear in a question, such as words, formulas, letters and punctuation marks.

By analyzing the whole page of picture information, question recognition and positioning are performed automatically, which improves the accuracy of image recognition and the precision with which wrong questions are cut out. The cut-out wrong questions are more accurate and complete, providing an accurate and effective basis for later question recommendation. See S111 to S113 for details.

S120, screening recommended questions corresponding to the wrong question information from a preset question database.

To better master the knowledge points involved in wrong questions, a user needs a large number of questions for reinforcement practice. Therefore, after the wrong question information corresponding to the wrong-question mark in the picture is acquired, it is input into the preset question database. Based on the wrong question information, the question database can extract the knowledge points involved in the question and the user's wrong-question recommendation history, intelligently search for questions to be recommended that contain those knowledge points, and determine the recommended questions according to the relationship between the wrong question and the questions to be recommended.

S130, pushing the recommended questions to the terminal of the user.

The system forms a personalized recommendation from the recommended questions screened from the database and feeds it back to the user's terminal for dedicated practice. Each time an answer is submitted during practice, the result is stored in the database so that the user's mastery of the wrong-question knowledge points can be tracked.

The user's terminal may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.

In the embodiment of the invention, the wrong question information corresponding to a wrong-question mark in a picture is acquired, recommended questions corresponding to the wrong question information are screened from a preset question database, and the recommended questions are pushed to the user's terminal. Analyzing the whole page of picture information allows questions to be recognized and located automatically, improves the accuracy of image recognition and the precision with which wrong questions are cut out, and yields more accurate and complete wrong questions, which helps the question database screen recommended questions efficiently and accurately. The user can thus practice, according to an individual study plan, the weak knowledge points exposed in everyday homework and test papers, and can obtain recommended questions on those weak knowledge points without having to answer questions inside a particular software system, which reduces the pressure of extracurricular work.

Fig. 2 is a schematic flowchart of acquiring the wrong question information corresponding to a wrong-question mark in a picture, which includes:

S111, recognizing the character information corresponding to the wrong-question mark in the picture.

In an alternative embodiment, image recognition first generates template patterns and normalizes them to obtain normalized bitmaps; the meta information of each pattern is recorded so that the normalized bitmap is not distorted. The template patterns are then matched against the patterns to be matched: a matching algorithm computes the similarity between each pattern to be matched and the template patterns, the most probable ASCII code value is found, and the recognition result is obtained.

In another alternative embodiment, a completely different solution is proposed: text blocks are extracted from the whole page of picture information so that the positions of the paragraphs, tables and formulas of the text are accurately distinguished, which facilitates splicing the recognition results. Specifically, as shown in Fig. 3, the text is cut by a U-shaped deep neural network that is trained through deep learning to obtain a neural network model with a text-cutting function. The model takes the whole picture as input, extracts text features through several rounds of convolution and downsampling, and then performs upsampling and convolution to restore an output picture of the same size as the original, in which text blocks and background areas are marked. For example, when a picture containing characters and a circuit structure, as shown in Fig. 4A, is input into the model, the character parts are recognized as character blocks of different sizes and the circuit structure becomes background, as shown in Fig. 4B, so that each piece of text can be recognized more easily. To improve precision, the network merges the convolution result obtained before each downsampling into the data after upsampling; this mechanism improves cutting precision on details.

After the text block information is obtained through the neural network model, text lines are extracted: the information of each line of text in the text block information is extracted to obtain the text line information. A picture may be tilted by some angle during scanning or shooting, so the text lines or columns in it are inclined. Before the line information is extracted, the rotation direction of the text therefore has to be determined and the angle of the text image corrected; the positions of the text lines are then determined, and the text of each line is extracted separately based on those positions.

In an alternative embodiment, the number of foreground pixels in each row is counted to obtain the projection of the pixels in the horizontal direction, and the projection is plotted as a cumulative curve of foreground pixels. As shown in Fig. 5A, the wavy line represents the projection of each line of text in the vertical direction; the smaller the angle between the text lines of the picture and the horizontal, the more pronounced the variation of this projection. The correction angle of the image can therefore be determined simply by finding the projection with the largest variation, which corresponds to the smallest angle between the text lines and the horizontal. After the image is adjusted to the horizontal according to the correction angle, as shown in Fig. 5B, the wavy line represents the projection of each line of text in the horizontal direction, and the horizontal start position of the first character and the horizontal end position of the last character of each line are determined from this projection information. The position of each line can be represented by a rectangle $(x_k, y_k, w_k, h_k)$, where $x_k$, $y_k$, $w_k$ and $h_k$ are the starting horizontal coordinate, starting vertical coordinate, line width and line height of the $k$-th line. Once the positions of the text lines are determined, the text of each line can be extracted separately; as shown in Fig. 5C, each line of text is cut out to obtain the line cutting information.
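The line extraction step can be illustrated with a minimal sketch. It assumes a binary image that has already been corrected to the horizontal, stored as a list of rows in which 1 marks a foreground pixel; the function names and the tiny sample image are hypothetical. Each text line is reported as a rectangle of starting horizontal coordinate, starting vertical coordinate, width and height.

```python
def row_projection(image):
    """Count the foreground (1) pixels in each row of a binary image."""
    return [sum(row) for row in image]

def extract_lines(image):
    """Return one (x, y, w, h) rectangle per text line of a deskewed binary image."""
    profile = row_projection(image)
    lines, start = [], None
    # The appended 0 closes a line that runs to the last row of the image.
    for y, count in enumerate(profile + [0]):
        if count > 0 and start is None:
            start = y                      # first row of a new text line
        elif count == 0 and start is not None:
            rows = image[start:y]
            cols = [x for x in range(len(image[0])) if any(r[x] for r in rows)]
            lines.append((cols[0], start, cols[-1] - cols[0] + 1, y - start))
            start = None
    return lines

# Hypothetical 6x6 binary image with two text lines separated by an empty row.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
lines = extract_lines(image)
```

Rows whose projection is zero act as separators, so the two groups of non-empty rows become two line rectangles.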

Based on the text line information, single-character information is extracted by combining the vertical projection method and the connected-domain method, and the position of each single character is determined. The vertical projection method computes, for each line, the density histogram of black pixels in the vertical direction and then segments the characters at the valleys of the projection, so that cuts fall in blank areas or where the connection between two characters is weakest. For mathematical expressions with complex formats, such as expressions containing a radical sign, the connected-domain method is used: it finds the connected parts of the image and merges the components according to their distance, so that basic parts are combined into a complete character. For example, during extraction the vertical projection method splits the character 吕 ("Lu") into two 口 components, and the connected-domain method can merge the two separated 口 components again, which improves the accuracy of the recognition result.
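The two techniques can be sketched on a binary text-line image as follows. This is an illustration under stated assumptions: the image is a list of rows with 1 for foreground, connectivity is 4-neighbor, and components are merged purely by their horizontal gap with a hypothetical `max_gap` threshold, a simplification of the distance-based merging described above.

```python
from collections import deque

def split_by_column_projection(line):
    """Split a binary text-line image into (start, end) column spans at empty columns."""
    width = len(line[0])
    profile = [sum(row[x] for row in line) for x in range(width)]
    spans, start = [], None
    for x, count in enumerate(profile + [0]):  # trailing 0 closes the last span
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            spans.append((start, x))
            start = None
    return spans

def connected_components(image):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected foreground regions."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if image[y][x] and not seen[y][x]:
                queue = deque([(x, y)])
                seen[y][x] = True
                x0 = x1 = x
                y0 = y1 = y
                while queue:
                    cx, cy = queue.popleft()
                    x0, x1 = min(x0, cx), max(x1, cx)
                    y0, y1 = min(y0, cy), max(y1, cy)
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                        if 0 <= nx < w and 0 <= ny < h and image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
                boxes.append((x0, y0, x1, y1))
    return boxes

def merge_close_boxes(boxes, max_gap=1):
    """Merge boxes whose horizontal gap is at most max_gap columns (assumed threshold)."""
    merged = []
    for box in sorted(boxes):
        if merged and box[0] - merged[-1][2] - 1 <= max_gap:
            last = merged[-1]
            merged[-1] = (last[0], min(last[1], box[1]), max(last[2], box[2]), max(last[3], box[3]))
        else:
            merged.append(box)
    return merged

# Hypothetical line: one character made of two parts one column apart, then a second character.
line = [
    [1, 1, 0, 1, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 0, 0, 1, 1],
]
spans = split_by_column_projection(line)
characters = merge_close_boxes(connected_components(line))
```

The projection alone yields three spans, while merging the close components recovers two characters.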

In an alternative embodiment, a training data set containing Chinese characters, English letters and mathematical symbols has to be established before character recognition. For the Chinese part, the 6763 common Chinese characters of the "Chinese Character Coded Character Set for Information Interchange, Basic Set" issued in 1980 are used; they cover most everyday Chinese characters, with 3755 characters in the level-1 set and 3008 in the level-2 set. The English part comprises 26 letters, or 52 distinct characters when upper and lower case are counted. The training sample library also needs common mathematical characters, which can be obtained statistically from the K12 question bank, as shown in Fig. 6. Once the character set is determined, each character is rendered in different sizes and fonts and the pictures are stored as the training picture set, each picture being one sample of a character. To enlarge the sample set, deformations such as perspective transformation and rotation can be applied to each picture to generate more samples. In addition, a library of convolution kernels simulating common kinds of blur is constructed, and each picture is convolved with them to generate blurred pictures. After this series of processing steps, a training set of single-character text pictures suitable for deep learning is obtained.

Traditional recognition methods are based on extracted image features, such as SIFT and HOG features, which describe the local shape of a text image. These features remain relatively stable under moderate image distortions such as scaling, rotation and illumination changes, so these operators can be used to recognize text to some extent. However, under large distortion, image blur or low text resolution, the recognition rate of character recognition based on traditional features is clearly lower than what the human eye can achieve.

In the embodiment of the invention, to improve the accuracy of text recognition, the single-character information is recognized by an Optical Character Recognition (OCR) model based on a convolutional neural network to obtain the character information.

As shown in Fig. 7, after the single-character information is obtained, the images containing it are scaled to the same size. Shallow features are extracted by convolution, and the convolved image is then downsampled so that its length and width are reduced to 1/2; downsampling provides a certain degree of robustness to deformation. A second convolution extracts deep features of the image and is followed by a second downsampling. Finally, a fully connected network produces the classification output, that is, a prediction probability for each candidate character, and the character with the highest probability is the recognition result.
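The data flow of this recognition network (convolution, downsampling, second convolution, second downsampling, fully connected layer, probabilities) can be traced with a small pure-Python sketch. It is a simplification under loud assumptions: a single channel and a single 3x3 kernel per layer, 2x2 max pooling as the downsampling, a softmax on the output, and random untrained weights, none of which are specified in the text. All names and sizes are hypothetical.

```python
import math
import random

def conv2d(image, kernel):
    """Valid 2-D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            s = sum(image[y + i][x + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out

def downsample(image):
    """2x2 max pooling: halves the height and width."""
    return [[max(image[y][x], image[y][x + 1], image[y + 1][x], image[y + 1][x + 1])
             for x in range(0, len(image[0]) - 1, 2)]
            for y in range(0, len(image) - 1, 2)]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(image, k1, k2, fc_weights):
    """Return one probability per candidate character."""
    features = downsample(conv2d(image, k1))      # shallow features, then half size
    features = downsample(conv2d(features, k2))   # deeper features, second downsampling
    flat = [v for row in features for v in row]
    logits = [sum(w * v for w, v in zip(ws, flat)) for ws in fc_weights]
    return softmax(logits)

rng = random.Random(0)
image = [[rng.random() for _ in range(14)] for _ in range(14)]    # 14x14 input
k1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
k2 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
# 14x14 -> conv 12x12 -> pool 6x6 -> conv 4x4 -> pool 2x2 -> 4 features, 3 classes
fc_weights = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
probs = classify(image, k1, k2, fc_weights)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

With trained weights, `predicted` would index the recognized character; here it only shows how the class with the highest probability is selected.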

In the embodiment of the invention, text block information is extracted from the whole page of text, text line information is extracted from the text block information, and single-character information is extracted from the text line information, with the text information recognized again at each extraction step. Text blocks are extracted by a convolutional neural network, the positions of text lines are determined from the projection of the line text, the line text is cut by the vertical projection method, and basic parts are merged by the connected-domain method. This reduces possible merging errors and greatly improves the accuracy of the recognition result.

S112, locating the position information of the start character and the position information of the end character of the character information.

Cutting out a question essentially means finding the positions of its start character and end character. After single characters and mathematical formulas are recognized, a large amount of text is obtained, and the attributes of the recognition results are further refined through analysis of the text content. For the pictures that appear in K12 education, a typical analysis task is to locate the start and end positions of questions automatically, which helps cut out questions quickly and makes it easier to build a school question bank quickly.

In an alternative embodiment, the first few characters of a question are usually the question number, which may consist of Arabic numerals, English letters, Chinese numerals or parentheses, and the question number is often followed by common expressions such as "known", "if" or "some". Based on this observation, the first sentence of every question in the question bank can be extracted to obtain a text set dedicated to recognizing the beginning of a question. A binary classification neural network is trained on this text set to form a model for recognizing the start position of a question; the model estimates whether each character is the start position of a question, outputting 1 if so and 0 otherwise. Typical expressions also appear in the last few characters of a question, such as "what?" or "how many meters?". The last sentence of every question in the question bank is extracted in the same way to obtain a text set dedicated to recognizing the end position of a question, and a binary classification neural network is trained on it to form a model for recognizing the end position of a question.

In an optional embodiment, the coordinate information of the text can be added to the above segmentation strategy to further optimize the cutting result. The beginning of a question tends to appear at the first character of a line of text, that is, the coordinates of that character differ markedly from those of the preceding character in the horizontal or vertical direction.

S113, acquiring wrong question information according to the position information of the start character and the position information of the end character.

The wrong question information is acquired based on the position information of the start character and the position information of the end character, so that different questions are separated more cleanly and individual questions are obtained. The cutting results are shown in FIG. 8.
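The cutting step can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: it assumes the two classification models have already produced a 0/1 flag per character, and the function name `cut_questions` and the sample text are hypothetical.

```python
from typing import List, Tuple


def cut_questions(text: str, is_start: List[int], is_end: List[int]) -> List[Tuple[int, int, str]]:
    """Pair each predicted start character with the next predicted end character.

    is_start[i] and is_end[i] stand for the 0/1 outputs of the start-position
    and end-position classifiers for character i (assumed to be given).
    """
    spans = []
    start = None
    for i in range(len(text)):
        if is_start[i] == 1 and start is None:
            start = i                                  # open a new question
        if is_end[i] == 1 and start is not None:
            spans.append((start, i, text[start:i + 1]))
            start = None                               # close it and wait for the next start
    return spans


# Hypothetical recognized text containing two questions.
text = "1. If x+1=2, what is x? 2. Known y=3, how many is 2y?"
is_start = [0] * len(text)
is_end = [0] * len(text)
is_start[0] = 1
is_end[text.index("?")] = 1
is_start[text.index("2.")] = 1
is_end[text.rindex("?")] = 1

for begin, end, question in cut_questions(text, is_start, is_end):
    print(begin, end, question)
```

A start flag that arrives while a question is still open is ignored in this sketch; a real system might instead use the coordinate information described above to resolve such conflicts.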

In an optional embodiment, the recognition result of each question is input into a preset question classification model to determine the question type, which facilitates subsequent question recommendation. For example, after a whole test paper is segmented, multiple wrong questions are obtained and each is input into the question classification model, which classifies it as a multiple-choice question, a fill-in-the-blank question, a calculation question, a short-answer question or another type. When a question is determined to be a multiple-choice question, the question database extracts the knowledge points of the wrong question and first recommends multiple-choice questions related to those knowledge points. If the user wants to practice applying the knowledge points in other question types, the user can click the option for the desired type, and the system recommends questions of that type related to the knowledge points according to the user's selection.

In an alternative embodiment, blurred images are restored before image recognition to improve recognition accuracy, because camera shake, defocus, image noise, image compression, low resolution, poor illumination and reflection all reduce the recognition rate. For general images, blind deconvolution is usually used for image restoration, but this method cannot make full use of the prior knowledge available for text images, so its restoration quality on text images is limited. In particular, text images contain many details, such as horizontal strokes, corners and circles, that still leave traces in blurred pictures. To restore images efficiently, a convolutional neural network is trained to deblur images, producing a deblurring model that is used to reconstruct the input image. Training the convolutional neural network requires a training set containing a large number of samples. First, different clear text images are selected to form an original sample gallery. Then, patches of the same size (for example, 200 pixels in both length and width) that contain characters are randomly cropped from these images. Finally, a library of convolution kernels simulating common types of blur is constructed, and each patch is convolved with the kernels to generate blurred pictures, giving a training set of 10 million pictures.

To simulate the blur commonly encountered in real scenes, the conditions that affect image sharpness are mainly considered: defocus blur, motion blur, ghosting, salt-and-pepper noise, dim light and so on. For motion blur, camera movement at various angles is taken into account, so blur is simulated in various directions. To accommodate text of different sizes, the scheme extracts text in font sizes from 5 to 100, so that the final model supports text in common font sizes. With the preparation of the three steps above, the simulated images are sufficient to cover the various conditions that affect legibility when primary and secondary school students take pictures with mobile phones. As shown in FIG. 9A, the blurred image is used as the input sample and the corresponding sharp image as the prediction target, and a convolutional neural network is trained to obtain the model. For example, for the blurred text image shown in FIG. 9B, the reconstructed image is shown in FIG. 9C. The reconstructed image shows the character shapes of the original image more clearly, and much of the text that the human eye cannot recognize in the original image becomes clearly visible.
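The construction of (blurred, sharp) training pairs can be sketched as follows. This is a simplified illustration under stated assumptions: images are plain lists of gray values, only horizontal and vertical motion-blur kernels are simulated, and the helper names (`motion_kernel`, `convolve`, `make_pair`) are hypothetical. A real kernel library would also cover defocus, ghosting, noise and other angles.

```python
import random
from typing import List, Tuple

Image = List[List[float]]


def motion_kernel(length: int, horizontal: bool = True) -> Image:
    """Return a length x length kernel that averages along one line (motion blur)."""
    kernel = [[0.0] * length for _ in range(length)]
    mid = length // 2
    for i in range(length):
        if horizontal:
            kernel[mid][i] = 1.0 / length
        else:
            kernel[i][mid] = 1.0 / length
    return kernel


def convolve(img: Image, kernel: Image) -> Image:
    """Same-size 2-D filtering with edge clamping.

    The kernels used here are symmetric, so correlation and convolution coincide.
    """
    h, w, n = len(img), len(img[0]), len(kernel)
    r = n // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(n):
                for kx in range(n):
                    yy = min(max(y + ky - r, 0), h - 1)
                    xx = min(max(x + kx - r, 0), w - 1)
                    acc += img[yy][xx] * kernel[ky][kx]
            out[y][x] = acc
    return out


def random_crop(img: Image, size: int, rng: random.Random) -> Image:
    top = rng.randrange(len(img) - size + 1)
    left = rng.randrange(len(img[0]) - size + 1)
    return [row[left:left + size] for row in img[top:top + size]]


def make_pair(sharp: Image, size: int, rng: random.Random) -> Tuple[Image, Image]:
    """Crop a patch and blur it; returns (blurred input, sharp target)."""
    crop = random_crop(sharp, size, rng)
    kernel = motion_kernel(rng.choice([3, 5]), horizontal=rng.random() < 0.5)
    return convolve(crop, kernel), crop


# Toy "text image": a dark vertical stroke on a white background.
sharp = [[0.0 if x == 6 else 1.0 for x in range(12)] for _ in range(12)]
blurred, target = make_pair(sharp, 8, random.Random(0))
```

Each kernel sums to 1, so overall brightness is preserved and only edges are smeared. The embodiment uses 200 x 200 patches; the sizes here are kept small so the pure-Python loop stays fast.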

In the embodiment of the application, deep-learning-based image restoration is performed for photographing conditions such as low resolution, heavy blur and low illumination, with a focus on pictures produced in students' learning environments at home or at school. A large number of text images that are difficult for traditional OCR to recognize thus become clear and recognizable, which greatly improves the accuracy of wrong question recognition, positioning and knowledge point analysis.

After the recognition result is obtained, it is processed with natural language processing to convert the natural language into a machine-readable form for further analysis. Natural Language Processing (NLP) technology has developed continuously over the past decades, but major breakthroughs remained elusive in some areas. In recent years various deep learning techniques have appeared, and on NLP tasks such as classification, matching, translation and structured prediction, deep learning methods perform better or clearly better than traditional methods.

Chinese word segmentation is the foundation of text mining: once a passage of input Chinese has been segmented successfully, a machine can automatically identify the meaning of its sentences. Chinese word segmentation techniques fall into two broad categories: mechanical word segmentation and statistics-based sequence labeling. Mechanical word segmentation segments text by means of a supplied dictionary. It is relatively simple and convenient to operate, but the dictionary lags behind actual usage, so segmentation of new words, ambiguous words and out-of-vocabulary words is not ideal. To address ambiguous segmentation, many optimized methods for mechanical word segmentation exist; common ones include forward maximum matching, reverse maximum matching, minimum word count segmentation, and path selection after full segmentation.
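Forward maximum matching, one of the mechanical methods named above, can be sketched in a few lines. The dictionary and sentence are made up for illustration, and the maximum word length of 4 is an assumption.

```python
from typing import List, Set


def forward_max_match(sentence: str, dictionary: Set[str], max_len: int = 4) -> List[str]:
    """Scan left to right, always taking the longest dictionary word that fits."""
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first and shrink until a dictionary hit;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


dictionary = {"余弦", "定理", "余弦定理", "三角形"}
print(forward_max_match("用余弦定理解三角形", dictionary))
# prints ['用', '余弦定理', '解', '三角形']
```

Characters that are not covered by any dictionary word fall out as single-character tokens, which illustrates why out-of-vocabulary words are handled poorly by purely mechanical segmentation.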

To address the problems of mechanical word segmentation, particularly the segmentation of words not recorded in the dictionary, better results can be obtained with statistics-based sequence labeling. Put simply, the method treats segmentation as a sequence labeling problem. It does not distinguish between Chinese and English and focuses mainly on the order of the language. Common models include the Hidden Markov Model (HMM) and the Conditional Random Field (CRF).

The basic idea of the HMM is to recover the true sequence of hidden state values from the observation sequence. In Chinese word segmentation, each character of a passage of text can be regarded as an observation, and the word-position tag of the character (B for the beginning of a word, M for the middle, E for the end, S for a single-character word) can be regarded as a hidden state. For HMM-based word segmentation, the five elements of the model are obtained by collecting statistics on a segmented corpus: the start probability matrix, the transition probability matrix, the emission probability matrix, the set of observation values and the set of state values. The start probability matrix gives the probability of the first state value of the sequence; in Chinese word segmentation, the start probabilities of M and E are theoretically 0. Transition probabilities are the probabilities of moving between states, such as the probability of B -> M or of E -> S. The emission probability is a conditional probability that represents the probability of a character appearing in the current state; for example, P(人 | B) is the probability of the character "人" (person) when the state is B. With the three matrices and two sets, the HMM problem finally becomes the problem of finding the hidden state sequence with the maximum probability.
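Finding the most probable hidden state sequence is usually done with the Viterbi algorithm, which the text does not name explicitly. The sketch below is a minimal version under stated assumptions: all probabilities are toy values invented for illustration rather than corpus statistics, and plain probabilities are multiplied for readability, whereas practical implementations work with log probabilities.

```python
from typing import Dict, List

STATES = ["B", "M", "E", "S"]

# Toy parameters (assumed, not estimated from any corpus).
START = {"B": 0.6, "M": 0.0, "E": 0.0, "S": 0.4}
TRANS = {
    "B": {"M": 0.3, "E": 0.7},
    "M": {"M": 0.3, "E": 0.7},
    "E": {"B": 0.5, "S": 0.5},
    "S": {"B": 0.5, "S": 0.5},
}
EMIT = {
    "B": {"余": 0.8, "弦": 0.1, "的": 0.1},
    "M": {"余": 0.3, "弦": 0.3, "的": 0.4},
    "E": {"余": 0.1, "弦": 0.8, "的": 0.1},
    "S": {"余": 0.1, "弦": 0.1, "的": 0.8},
}


def viterbi(obs: str, states: List[str], start_p: Dict, trans_p: Dict, emit_p: Dict) -> List[str]:
    # best[t][s] = (probability of the best path ending in state s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s].get(obs[0], 0.0), None) for s in states}]
    for ch in obs[1:]:
        column = {}
        for s in states:
            prob, prev = max((best[-1][p][0] * trans_p[p].get(s, 0.0), p) for p in states)
            column[s] = (prob * emit_p[s].get(ch, 0.0), prev)
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


def tags_to_words(text: str, tags: List[str]) -> List[str]:
    """Cut the text after every E or S tag."""
    words, current = [], ""
    for ch, tag in zip(text, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


tags = viterbi("余弦的", STATES, START, TRANS, EMIT)
print(tags, tags_to_words("余弦的", tags))
```

With these toy values the best path for "余弦的" is B, E, S, which cuts the text into "余弦" and "的".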

Conditional Random Fields (CRFs) are probabilistic structured models for labeling and segmenting structured data. They are commonly used in pattern recognition and machine learning and are widely applied in natural language processing, image processing and other fields. Like the HMM, a CRF works with a given input observation sequence X and an output sequence Y, but it describes the model by defining the conditional probability P(Y | X) rather than the joint probability distribution P(X, Y).

For sequence problems, a deep learning model that is particularly suitable is the Long Short-Term Memory network (LSTM), an extension of the RNN designed specifically to avoid the long-term dependency problem. The repeating module of an LSTM has a different structure from that of a plain RNN: it contains four neural network layers that interact in a special way. The key to the LSTM is the cell state, which is somewhat like a conveyor belt. Information is added to or removed from the cell state by gate structures. A gate is a way of selectively letting information through and typically consists of a sigmoid neural network layer and a pointwise multiplication (the output of the sigmoid layer lies between 0 and 1 and defines how much information passes: 0 lets nothing through and 1 lets everything through). The LSTM has three gates, an input gate, a forget gate and an output gate, to maintain and update the cell state. In the Chinese word segmentation task, the LSTM memory unit takes Chinese characters from a context window as input. For each Chinese character c(t), the input X(t) of the LSTM memory unit is the concatenation of the embeddings of the surrounding characters (c(t-k), …, c(t), …, c(t+k)), where k is the distance from the current character. After a linear transformation, the output of the LSTM unit is passed to a label inference function, which infers the label of the Chinese character.
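The text does not give the gate equations. In the widely used LSTM formulation, which is assumed here to be the variant intended, the four layers and the cell state update can be written as follows, where $x_t$ is the input, $h_{t-1}$ the previous output, $C_t$ the cell state, $\sigma$ the sigmoid function and $\odot$ the pointwise product:

$$
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate cell state)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}
$$

The weight matrices $W_f, W_i, W_C, W_o$ and biases $b_f, b_i, b_C, b_o$ are notation introduced for this sketch; they do not appear in the original text.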

When the bags of words of two sentences are identical, the sentences may still express different things if grammatical structure is ignored. For example, "5x + 6 equals 0" and "5 + 6x equals 0" have the same word segmentation result but different meanings. In such cases, syntactic analysis reveals what the user is actually asking about.

We divide syntactic analysis into three parts, and all three parsing steps use a bidirectional Long Short-Term Memory recurrent neural network (bi-LSTM) model, which is a variant of the LSTM. Experiments show that the bi-LSTM can bridge the gap between a sequence and a tree and yields better parsing results.

In an optional embodiment, screening a recommended question corresponding to the wrong question information from a preset question database includes: calculating the similarity between the wrong question information and the questions in the question database to obtain the questions to be recommended corresponding to the wrong question information; and calculating the importance of the questions to be recommended, and screening the top N recommended questions corresponding to the wrong question information in descending order of importance.

Specifically, a large amount of question information is obtained through the above process. The questions are related to one another, and relationships such as similarity and importance can be obtained through different kinds of processing. Based on these labels, the relationships can be applied in many education scenarios.

The traditional way of calculating similarity uses cosine similarity. The question is first segmented into words, and its keywords are then obtained by calculating TF-IDF, where TF and IDF denote term frequency and inverse document frequency respectively, and the TF-IDF weight is their product:

$$\text{TF-IDF} = \text{TF} \times \text{IDF}$$
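The individual formulas for TF and IDF are not reproduced in the text. One common form, given here as an assumption about the variant intended, is:

$$
\mathrm{TF}(w, d) = \frac{n_{w,d}}{\sum_{k} n_{k,d}}, \qquad
\mathrm{IDF}(w) = \log\frac{N}{\lvert \{ d : w \in d \} \rvert}
$$

Here $n_{w,d}$ is the number of times word $w$ occurs in question $d$, the sum runs over all words $k$ in $d$, and $N$ is the total number of questions in the question bank. Many implementations add a smoothing constant to the denominator of IDF so that unseen words do not cause a division by zero.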

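The word frequency set is then converted into a vector, and finally the angle between two vectors is calculated with the cosine formula to represent the similarity of two questions. For two vectors $A$ and $B$, the cosine formula is:

$$\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^2}\,\sqrt{\sum_{i} B_i^2}}$$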
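The traditional pipeline can be sketched end to end as follows. This is a minimal illustration: the questions are assumed to be already segmented into words, the IDF variant log(N / document frequency) is an assumption, and the three sample questions are made up.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one sparse TF-IDF vector (word -> weight) per segmented question."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


questions = [
    ["solve", "triangle", "cosine", "theorem"],
    ["apply", "cosine", "theorem", "triangle"],
    ["compute", "area", "circle"],
]
vectors = tf_idf_vectors(questions)
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

The first two questions share three words and therefore score higher than the first and third, which share none. The score depends only on overlapping words, not on meaning.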

However, this similarity calculation has a shortcoming: it captures only textual similarity and not semantic similarity. We therefore adopt a deep semantic matching model, the Deep Structured Semantic Model (DSSM).

The DSSM is a latent semantic model with a multilayer neural network. It computes the similarity between documents and keywords by projecting them into a low-dimensional space. The training data used for the DSSM are questions actually recommended by teachers, and training maximizes the conditional probability of the recommended questions in the training data, which overcomes the semantic mismatch of the traditional similarity calculation.

The structure of the DSSM neural network is shown in FIG. 10. The word hashing in the first hidden layer is a dimension reduction measure that the DSSM adopts for excessively high-dimensional data. For Chinese, however, word hashing greatly increases the data dimension, so a Word2vec word vector model is used instead to construct the input vectors.

The DSSM neural network is trained on a large amount of data covering various questions to obtain a model with an accuracy above 0.99. All questions in the current question database are then input into the model for prediction, and the prediction results are stored in the graph database. The change in accuracy during training is shown in FIG. 11.

According to the similarity between the knowledge points, the importance of the knowledge points is obtained through a PageRank algorithm, and the algorithm formula is as follows:
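The formula itself is not reproduced in the text. The sketch below shows the standard damped PageRank iteration; weighting each edge by the similarity between two knowledge points, the damping factor of 0.85 and the sample graph are all assumptions made for illustration.

```python
from typing import Dict


def pagerank(weights: Dict[str, Dict[str, float]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Weighted PageRank by power iteration.

    weights[u][v] is the similarity on the edge from u to v; each node passes
    its rank to its neighbours in proportion to these weights.
    """
    nodes = sorted(set(weights) | {v for nbrs in weights.values() for v in nbrs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out = weights.get(u, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: spread its rank evenly over all nodes.
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
            else:
                for v, w in out.items():
                    new_rank[v] += damping * rank[u] * w / total
        rank = new_rank
    return rank


# Hypothetical similarity graph: one knowledge point is similar to three others.
similarity = {
    "kp_hub": {"kp_a": 0.5, "kp_b": 0.5, "kp_c": 0.5},
    "kp_a": {"kp_hub": 0.5},
    "kp_b": {"kp_hub": 0.5},
    "kp_c": {"kp_hub": 0.5},
}
importance = pagerank(similarity)
print(max(importance, key=importance.get))
```

The ranks always sum to 1, and the knowledge point connected to the most neighbours receives the highest importance in this toy graph.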

A knowledge graph is essentially a semantic network, and is a graph-based data structure, consisting of nodes and edges. In the knowledge graph, each node may represent a student, a topic, a knowledge point, etc., and each edge is a "relationship" between nodes, such as a relationship between a student and a topic, and a relationship between a topic and a knowledge point. Knowledge-graphs are the most efficient way to represent relationships. Generally, a knowledge graph is a relational network obtained by connecting all different kinds of information together. Knowledge-graphs provide the ability to analyze problems from a "relational" perspective.

The knowledge graph is built on a graph database. In the embodiment of the invention, the Neo4j graph database, a high-performance NoSQL graph database, is selected. The most basic concepts in Neo4j are nodes and relationships: a node represents an entity, two nodes may be connected by different relationships, and different attributes may be added to both nodes and relationships.

Neo4j has many advantages. It represents connected data easily, building large data sets from three parts: nodes, relationships and attributes. Retrieving, traversing and navigating highly connected data is easy and fast, and relationship queries return at the millisecond level even with billions of nodes and relationships. The Neo4j CQL query language is human-readable and easy to learn, and a single machine supports the storage of billions of records.

A question map is constructed on the Neo4j graph database, as shown in FIG. 12A. Besides the relationships among questions, a question-knowledge point-student relationship network is built from the data of students answering questions, so that each student's mastery of the knowledge points can be obtained. The relationships carry the calculated degree of mastery of the knowledge points, so each student's mastery can be displayed visually.

In an optional embodiment, the question map can recommend questions to a student according to the student's wrong questions. The recommendation degree between questions is stored in the form shown in FIG. 12B: each node is a question, each edge between nodes is the recommendation degree between the two questions, and the first few questions with the highest recommendation degrees are selected for pushing. The recommendation degree is obtained from the similarity and the importance of the questions. According to the similarity between the wrong question and the questions in the question database, the question map screens out questions of the same type as the wrong question to obtain the questions to be recommended. The importance of the questions to be recommended is then calculated, and the top N recommended questions are selected in descending order of importance. N may be a percentage of the questions to be recommended or a fixed number of them. For example, 20 questions to be recommended are screened according to the similarity between questions, their importance is calculated, and they are sorted by importance from high to low. The top 20% of the questions to be recommended can be set as the recommended questions, that is, the top 4 questions are selected and pushed; any other percentage, such as 30% or 60%, can also be set, which is not limited here. Alternatively, the top 5 of the 20 questions to be recommended may be selected as the recommended questions, or any other integer such as 6 or 7 may be set, which is not limited here. Note that N cannot be greater than the number of questions to be recommended.
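The top-N selection can be sketched as follows. The function name `select_top_n` is hypothetical, and rounding a percentage down to a whole number of questions is an assumption, since the text does not say how fractional counts are handled.

```python
import math
from typing import List, Optional, Tuple


def select_top_n(candidates: List[Tuple[str, float]],
                 count: Optional[int] = None,
                 percent: Optional[float] = None) -> List[str]:
    """Pick the most important candidates, by fixed count or by percentage.

    candidates holds (question id, importance) pairs for the questions to be
    recommended. Exactly one of count / percent must be given.
    """
    if (count is None) == (percent is None):
        raise ValueError("give exactly one of count or percent")
    if percent is not None:
        count = math.floor(len(candidates) * percent / 100)  # assumed rounding rule
    n = min(count, len(candidates))        # N may not exceed the number of candidates
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    return [question_id for question_id, _ in ranked[:n]]


# 20 hypothetical candidates with made-up importance values.
candidates = [(f"q{i}", i / 20) for i in range(20)]
print(select_top_n(candidates, percent=20))   # top 20% of 20 questions -> 4 questions
print(select_top_n(candidates, count=5))      # fixed number
```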

Knowledge tracking is a computer-aided education problem whose solution can reduce the cost of learning for students. When a student studies a set of courses, the student's knowledge can be modeled by tracking the learning path, so that future performance can be predicted accurately. Based on each student's individual knowledge, knowledge tracking can skip questions that are too simple or too difficult and recommend personalized course content. Because of the complexity of the human brain and of knowledge, knowledge tracking is a very difficult problem and needs complex models. Most existing work in the education field is based on Markov models and applies only to limited scenarios.

In an alternative embodiment, a deep knowledge tracking model based on a recurrent neural network solves the knowledge tracking problem over sufficiently long time series. The model uses a neural network with a large number of nodes to learn and abstract latent features of knowledge from data. In the basic structure of a recurrent neural network, the output of the network is stored in a memory unit, and the memory unit enters the network together with the next input. A simple two-layer network serves as a demonstration, and the recurrent structure is expanded on that basis.

The knowledge tracking problem can be described as follows: given the sequence of a student's actions $x_0, \ldots, x_t$ in a particular learning task, predict the student's next action $x_{t+1}$. In the most common form of knowledge tracking, an action is represented as a tuple $x_t = \{q_t, a_t\}$, where $q_t$ is the label of the exercise and $a_t$ indicates whether the exercise was answered correctly. When making a prediction, the exercise label $q_t$ is given and the model predicts whether the exercise is answered correctly, that is, $a_t$.

As shown in FIG. 13, $h_0$ represents the initial state. By computing a hidden sequence $h_1, \ldots, h_T$, the network maps the input sequence $x_1, \ldots, x_T$ to the output sequence $y_1, \ldots, y_T$. The hidden sequence can be regarded as an encoding of information about past behavior that is used to predict future behavior. The variables in the network are defined as follows:

$$h_t = \tanh\left(W_{hx} x_t + W_{hh} h_{t-1} + b_h\right),$$

$$y_t = \delta\left(W_{yh} h_t + b_y\right),$$

The input $x_t$ of the dynamic network is a real-valued vector representation of the student's behavior, and the prediction $y_t$ is a vector of the probabilities of answering each question correctly.

Here tanh and the sigmoid function $\delta(\cdot)$ are applied element-wise. The parameters of the model include the input weight matrix $W_{hx}$, the recurrent weight matrix $W_{hh}$, the output weight matrix $W_{yh}$, and the biases $b_h$ and $b_y$ of the hidden layer and the output layer; $h_t$ and $h_{t-1}$ are the hidden states at steps $t$ and $t-1$.
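The forward pass defined by these two equations can be sketched without any library. The sketch makes several assumptions that are not in the text: the tuple (q_t, a_t) is one-hot encoded in a vector of length twice the number of questions, the weights are random and untrained, and all function names are hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def encode(q: int, a: int, num_questions: int) -> Vector:
    """One-hot encode (q_t, a_t); an assumed encoding, the text only says x_t is a real vector."""
    x = [0.0] * (2 * num_questions)
    x[q + a * num_questions] = 1.0
    return x


def rnn_step(x, h_prev, W_hx, W_hh, b_h, W_yh, b_y):
    """h_t = tanh(W_hx x_t + W_hh h_{t-1} + b_h);  y_t = sigmoid(W_yh h_t + b_y)."""
    pre = [i + r + b for i, r, b in zip(matvec(W_hx, x), matvec(W_hh, h_prev), b_h)]
    h = [math.tanh(p) for p in pre]
    y = [sigmoid(o + b) for o, b in zip(matvec(W_yh, h), b_y)]
    return h, y


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


num_questions, hidden = 3, 4
rng = random.Random(0)
W_hx = random_matrix(hidden, 2 * num_questions, rng)
W_hh = random_matrix(hidden, hidden, rng)
W_yh = random_matrix(num_questions, hidden, rng)
b_h, b_y = [0.0] * hidden, [0.0] * num_questions

h = [0.0] * hidden                      # h_0, the initial state
history = [(0, 1), (2, 0), (1, 0)]      # (exercise label, 1 = correct / 0 = wrong)
for q, a in history:
    h, y = rnn_step(encode(q, a, num_questions), h, W_hx, W_hh, b_h, W_yh, b_y)
print([round(p, 3) for p in y])         # untrained weights, so the values are arbitrary
```

After the loop, `y` holds one predicted probability per exercise label. In a trained model these would be the probabilities of the student answering each exercise correctly next.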

In an optional embodiment, the knowledge points of a student's wrong questions are input into a recurrent neural network for training, the student's weak knowledge points are modeled, and a deep knowledge tracking model is obtained. The student enters wrong questions from teaching-aid books, test papers and homework into the system by photographing or scanning them, or generates new wrong questions during wrong-question reinforcement training in the system. The wrong questions in the system serve as the input sequence of the deep knowledge tracking model, and the student's future behavior is predicted by calculating the similarity and importance of the hidden sequence, that is, of the historical wrong questions. For example, a student gets a question on the law of cosines wrong and photographs it into the system. The system screens questions related to the law of cosines from the question database, pushes them to the student's terminal, and stores the wrong question record. The law of cosines is also input into the recurrent neural network for continued training, yielding a deep knowledge tracking model that covers the law of cosines. The user then practices the questions pushed to the terminal. When some of them are answered wrongly again, the deep knowledge tracking model analyzes the question types of those wrong questions and the way the law of cosines is applied in them, and adds them to a list of questions for focused reinforcement. Based on the wrong question records, the system can give priority to recommending questions that involve the law of cosines in the course chapters the user studies later, which strengthens the user's mastery of the law of cosines.

In an optional embodiment, the system is combined with an operator's nationwide mobile network to collect students' wrong questions systematically across the country and to monitor, by region, a summary of the knowledge points involved in wrong questions. The summary includes the weights of the knowledge points involved in the wrong questions of students at a given school stage, and it can be refined to a given school year, a given point in time, a given city, or even a given school. For example, for junior middle school students in Zhejiang province, the weight distribution of the knowledge points involved in wrong questions in the first school year in 2019 can be compiled from the wrong questions of daily practice, so that the most error-prone knowledge points at each stage are clearly known. An eighth-grade student can likewise view the wrong knowledge points collected from students of the same grade in several cities of Zhejiang province, such as Hangzhou, Jinhua and Ningbo, and learn which knowledge points students of the same grade across the province tend to get wrong, which helps the student master what has been learned comprehensively. Teaching priorities are not necessarily the same in different regions, so students' wrong questions are distributed differently, and wrong question information collected from a group of students is more comprehensive than that of a single student. Compared with earlier adaptive learning analysis techniques, the adaptive learning of the embodiment of the application lets students reinforce their practice with wrong questions encountered by other students of the same age group and region, so that they do not study in isolation.
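The regional weight summary can be sketched as a simple grouped count. The record layout (region, grade, knowledge point), the use of relative frequency as the weight, and the sample records are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, int, str]   # (region, grade, knowledge point of a wrong question)


def knowledge_point_weights(records: Iterable[Record]) -> Dict[Tuple[str, int], Dict[str, float]]:
    """For each (region, grade) group, return the share of wrong questions per knowledge point."""
    counts: Dict[Tuple[str, int], Counter] = defaultdict(Counter)
    for region, grade, knowledge_point in records:
        counts[(region, grade)][knowledge_point] += 1
    return {
        group: {kp: c / sum(counter.values()) for kp, c in counter.items()}
        for group, counter in counts.items()
    }


# Made-up wrong question records.
records = [
    ("Hangzhou", 8, "law of cosines"),
    ("Hangzhou", 8, "law of cosines"),
    ("Hangzhou", 8, "quadratic equations"),
    ("Ningbo", 8, "quadratic equations"),
]
weights = knowledge_point_weights(records)
print(weights[("Hangzhou", 8)])
```

Grouping by other keys, such as school or time period, follows the same pattern by extending the tuple used as the group key.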

In the embodiment of the invention, the character information corresponding to the wrong question identifier in the picture is recognized, the position information of the start character and of the end character of the character information is found, and the wrong question information is obtained from these positions. This improves the accuracy of wrong question recognition and positioning and the precision of wrong question segmentation, so that wrong questions can be cut out quickly and accurately. The recommended questions corresponding to the wrong question information are then screened from the preset question database and pushed to the user's terminal, so that the user can quickly, conveniently and accurately obtain a large number of questions related to the knowledge points of the wrong questions.

Fig. 14 is a schematic structural diagram of an apparatus according to an embodiment of the present invention. As shown in fig. 14, the apparatus may include an obtaining module 210, a screening module 220, and a recommending module 230, wherein:

the obtaining module 210 is configured to obtain wrong question information corresponding to a wrong question identifier in a picture;

the screening module 220 is configured to screen a recommended question corresponding to the wrong question information from a preset question database;

and the recommending module 230 is configured to push the recommended question to the terminal of the user.

In an alternative embodiment, as shown in fig. 15, the obtaining module 210 includes:

the recognition module 211 is configured to recognize character information corresponding to the wrong question identifier in the picture;

a positioning module 212, configured to find position information of a start character and position information of an end character of the character information;

and the extracting module 213 is configured to obtain wrong question information according to the position information of the start character and the position information of the end character.

In an optional embodiment, the recognition module 211 is specifically configured to extract text block information in the picture, extract text line information based on the text block information, extract single character information based on the text line information, and recognize the single character information through an optical character recognition model to obtain character information.

In an optional embodiment, the screening module 220 is specifically configured to calculate a similarity between the wrong-question information and the questions in the question database, obtain the to-be-recommended questions corresponding to the wrong-question information, calculate the importance of the to-be-recommended questions, and screen the first N recommended questions corresponding to the wrong-question information according to an arrangement order of the importance from high to low.

In an alternative embodiment, the screening module 220 is further configured to calculate the similarity based on the deep semantic matching model.

In an alternative embodiment, the formula for calculating the importance in the screening module 220 is:

Each module/unit in the apparatus shown in fig. 14 has a function of implementing each step in fig. 1, and can achieve the corresponding technical effect, and for brevity, the description is not repeated here.

Fig. 16 is a schematic diagram illustrating a hardware structure of an intelligent recommendation device according to an embodiment of the present invention.

The intelligent recommendation device may include a processor 1601 and a memory 1602 storing computer program instructions.

Specifically, the processor 1601 may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits configured to implement the embodiments of the present invention.

Memory 1602 may include mass storage for data or instructions. By way of example, and not limitation, memory 1602 may include a Hard Disk Drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. In one example, memory 1602 may include removable or non-removable (or fixed) media, or memory 1602 may be non-volatile solid-state memory. The memory 1602 may be internal or external to the intelligent recommendation device.

In one example, the memory 1602 may be a Read Only Memory (ROM). In one example, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.

The processor 1601 reads and executes the computer program instructions stored in the memory 1602 to implement the methods/steps S110 to S130 in the embodiment shown in fig. 1, and achieve the corresponding technical effects achieved by the embodiment shown in fig. 1 executing the methods/steps thereof, which are not described herein again for brevity.

In one example, the intelligent recommendation device may also include a communication interface 1603 and a bus 1610. As shown in fig. 16, the processor 1601, the memory 1602, and the communication interface 1603 are connected via a bus 1610 to complete communication with each other.

Communication interface 1603 is mainly used for implementing communication among modules, apparatuses, units and/or devices in the embodiment of the invention.

Bus 1610 includes hardware, software, or both to couple the components of the intelligent recommendation device to each other. By way of example, and not limitation, a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or other suitable bus, or a combination of two or more of these. Bus 1610 may include one or more buses, where appropriate. Although specific buses have been described and shown in the embodiments of the invention, any suitable buses or interconnects are contemplated by the invention.

The intelligent recommendation device can execute the intelligent recommendation method in the embodiment of the invention based on wrong-question pictures in daily exercises such as textbooks, teaching and assistant books, test papers, post-lesson operations and the like, thereby realizing the intelligent recommendation method described in combination with fig. 1.

In addition, in combination with the intelligent recommendation method of the foregoing embodiments, an embodiment of the present invention may provide a computer storage medium for implementing the method. The computer storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement any of the intelligent recommendation methods of the above embodiments.

It is to be understood that the invention is not limited to the specific configurations and processes described above and shown in the drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications and additions, or change the order of the steps, after comprehending the spirit of the present invention.

The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, they may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted over a transmission medium or a communication link by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the Internet or an intranet.

It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.

Aspects of the present disclosure are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The above are merely specific embodiments of the present invention. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments and are not described herein again. It should be understood that the scope of protection of the present invention is not limited thereto; any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and such modifications or substitutions shall fall within the scope of protection of the present invention.

Claims

1. A method of intelligent recommendation, comprising:

obtaining wrong question information corresponding to a wrong question identifier in a picture;

screening a recommended question corresponding to the wrong question information from a preset question database;

and pushing the recommended question to a terminal of a user.

2. The method according to claim 1, wherein the obtaining of the wrong question information corresponding to the wrong question identifier in the picture comprises:

identifying character information corresponding to the wrong question identifier in the picture;

and acquiring the wrong question information according to the position information of the starting character and the position information of the ending character.

3. The method according to claim 2, wherein the identifying of the character information corresponding to the wrong question identifier in the picture comprises:

extracting text block information in the picture;

extracting text line information based on the text block information;

extracting single character information based on the text line information;

4. The method according to claim 1, wherein the screening of the recommended question corresponding to the wrong question information from a preset question database comprises:

and calculating the importance of the questions to be recommended, and screening the top N questions corresponding to the wrong question information in descending order of importance.

5. The method of claim 4, wherein the similarity is computed based on a deep semantic matching model.

6. The method according to claim 4, wherein the importance is calculated by the formula:

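7. An intelligent recommendation device, the device comprising:

the obtaining module is used for obtaining wrong question information corresponding to a wrong question identifier in a picture;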

the screening module is used for screening a recommended question corresponding to the wrong question information from a preset question database;

and the recommending module is used for pushing the recommended question to a terminal of a user.

8. The apparatus of claim 7, wherein the obtaining module comprises:

and the extraction module is used for acquiring the wrong question information according to the position information of the starting character and the position information of the ending character.

9. An intelligent recommendation device, characterized in that the device comprises: a processor, and a memory storing computer program instructions; the processor reads and executes the computer program instructions to implement the intelligent recommendation method of any one of claims 1-6.

10. A computer storage medium having computer program instructions stored thereon, which when executed by a processor implement the intelligent recommendation method of any one of claims 1-6.