CN111553151A - Question recommendation method and device based on field similarity calculation and server

Info

Publication number: CN111553151A
Application number: CN202010255040.6A
Authority: CN
Inventors: 赵亮
Original assignee: OneConnect Financial Technology Co Ltd Shanghai
Current assignee: OneConnect Financial Technology Co Ltd Shanghai
Priority date: 2020-04-02
Filing date: 2020-04-02
Publication date: 2020-08-18
Also published as: WO2021196934A1

Abstract

The application belongs to the technical field of computers and provides a question recommendation method and device based on field similarity calculation, a storage medium and a server. The question recommendation method comprises the following steps: acquiring an input first question sentence; performing word segmentation on the first question sentence and extracting each field contained in it; comparing the extracted fields one by one with the fields in a pre-constructed field data table, and determining each extracted field that also appears in the field data table as a target field; calculating the similarity between the target field and each other field in the field data table; and selecting, from the other fields, the field with the highest similarity and using it to replace the target field in the first question sentence, thereby obtaining a recommended second question sentence. With this method, new question sentences that better match the user's expectations can be generated, which improves the accuracy of question recommendation in an intelligent question-answering system.

Description

Question recommendation method and device based on field similarity calculation and server

Technical Field

The application belongs to the technical field of computers, and particularly relates to a question recommendation method and device based on field similarity calculation, a storage medium and a server.

Background

The working principle of an intelligent question-answering system based on natural language is that a user inputs a question, the intelligent question-answering system carries out natural language processing on the question to generate a structured query language, then the content of a response is searched in a database or a knowledge base according to the structured query language, and finally a query result is returned to the user.

At present, an intelligent question-answering system recommends questions in two main ways. One is real-time recommendation, which makes recommendations according to the question the user is currently entering; the other is similar-question recommendation. Real-time recommendation is usually triggered by keywords; for example, when a user inputs "by", enumerated field names are recommended. Similar-question recommendation randomly replaces keywords of the same type in the original question to assemble a new question. However, the questions produced by these two approaches are often far from what the user expects, so the accuracy of question recommendation is low.

Disclosure of Invention

In view of this, the present application provides a question recommendation method, device, storage medium, and server based on field similarity calculation, which can improve the precision of the intelligent question-answering system in recommending questions.

In a first aspect, an embodiment of the present application provides a question recommendation method based on field similarity calculation, including:

acquiring an input first question sentence;

performing word segmentation processing on the first question sentence, and extracting each field contained in the first question sentence;

comparing the extracted fields one by one with the fields in a pre-constructed field data table, finding the extracted fields that are the same as fields in the field data table, and determining them as target fields;

respectively calculating the similarity between the target field and each other field except the target field in the field data table;

and selecting the field with the highest similarity in the other fields, and replacing the target field in the first question sentence to obtain the recommended second question sentence.

Further, the similarity between the target field and any other field in the field data table can be calculated by the following steps:

calculating a similarity index of the target field and any one of the other fields by combining the character string and the enumerated value of the target field and the character string and the enumerated value of any one of the other fields, wherein the similarity index is a parameter for measuring the similarity between the two fields;

and calculating the similarity between the target field and any one of the other fields according to the similarity indexes of the target field and any one of the other fields.

Further, calculating the similarity index between the target field and any one of the other fields may include:

calculating a character string similarity index, a character string length similarity index, an enumerated value number similarity index and an enumerated value length similarity index of the target field and the any one of the other fields;

calculating the similarity between the target field and the any one of the other fields according to their similarity indices may include:

calculating an average value or a weighted average value of the character string similarity index, the character string length similarity index, the enumerated value number similarity index and the enumerated value length similarity index as the similarity between the target field and the any one of the other fields.

Further, the character string similarity index may be calculated by the following formula:

wherein s₁ represents the character string similarity index, sim represents the number of identical characters in the two fields, short represents the string length of the shorter of the two fields, long represents the string length of the longer of the two fields, and α is a hyperparameter used to control the influence of the character string on the similarity;

the character string length similarity index may be calculated by the following formula:

wherein s₂ represents the character string length similarity index, short represents the string length of the shorter of the two fields, and long represents the string length of the longer of the two fields;

the enumerated value number similarity index may be calculated by the following formula:

wherein s₃ represents the enumerated value number similarity index, min represents the number of enumerated values of the field with fewer enumerated values, and max represents the number of enumerated values of the field with more enumerated values;

the enumerated value length similarity index may be calculated by the following formula:

wherein s₄ represents the enumerated value length similarity index, avg_min represents the average enumerated value length of the field whose enumerated values are shorter on average, and avg_max represents the average enumerated value length of the field whose enumerated values are longer on average.

Further, the separately calculating the similarity between the target field and each of the other fields in the field data table except the target field may include:

searching all historical question sentences of the user who inputs the first question sentence;

constructing a co-occurrence matrix according to the historical question sentences, wherein the co-occurrence matrix records the times of the common occurrence of any two fields in the field data table in the same historical question sentences of the user;

and calculating the similarity between the target field and each other field according to the co-occurrence matrix.

Further, the determining the similarity between the target field and each of the other fields according to the co-occurrence matrix may include:

extracting field vectors of the target fields and field vectors of each other field from the co-occurrence matrix respectively, wherein each element of the field vectors is the times of the common occurrence of the corresponding field and each field in the field data table in the same historical question sentence of the user;

and respectively calculating cosine similarity between the field vector of the target field and the field vectors of each other field to obtain the similarity between the target field and each other field.

Further, after constructing the co-occurrence matrix according to the historical question statement, the method may further include:

determining, according to the co-occurrence matrix, the field in the field data table that appears together with the target field in the same historical question sentences of the user the greatest number of times;

and selecting that field to replace the target field in the first question sentence to obtain a recommended third question sentence.

In a second aspect, an embodiment of the present application provides a question recommendation device based on field similarity calculation, including:

the question acquisition module is used for acquiring an input first question sentence;

the word segmentation module is used for carrying out word segmentation on the first question sentence and extracting each field contained in the first question sentence;

the field comparison module is used for comparing the extracted fields one by one with the fields in a pre-constructed field data table, finding the extracted fields that are the same as fields in the field data table, and determining them as target fields;

the field similarity calculation module is used for calculating the similarity between the target field and each other field except the target field in the field data table;

and the question recommending module is used for selecting the field with the highest similarity in the other fields, and replacing the target field in the first question sentence to obtain a recommended second question sentence.

In a third aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the question recommendation method set forth in the first aspect of the embodiments of the present application.

In a fourth aspect, an embodiment of the present application provides a server, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the question recommendation method set forth in the first aspect of the embodiments of the present application.

In a fifth aspect, an embodiment of the present application provides a computer program product which, when run on a terminal device, causes the terminal device to execute the steps of the question recommendation method according to the first aspect.

According to the question recommendation method based on field similarity calculation, after the fields of an input question sentence are extracted, they are compared one by one with the fields in a pre-constructed field data table, and each extracted field that also appears in the field data table is determined as a target field; the similarity between the target field and each other field in the field data table is then calculated, and the field with the highest similarity is used to replace the target field in the question sentence, thereby obtaining the recommended question sentence. Compared with the conventional approach of randomly replacing keywords of the same type in a sentence, the present application takes the similarity among all preset fields into account and replaces the field in the original question sentence with the most similar field, so it can generate new question sentences that better match the user's expectations and improve the accuracy of the questions recommended by the intelligent question-answering system.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a flowchart of a first embodiment of a question recommendation method provided by an embodiment of the present application;

FIG. 2 is a flowchart of a second embodiment of a question recommendation method provided by an embodiment of the present application;

FIG. 3 is a flowchart of a third embodiment of a question recommendation method provided by an embodiment of the present application;

FIG. 4 is a block diagram of an embodiment of a question recommendation device provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a server according to an embodiment of the present application.

Detailed Description

In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures and techniques, in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail. Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used only to distinguish between descriptions and are not to be construed as indicating or implying relative importance.

The application provides a question recommendation method, a question recommendation device, a storage medium and a server, which can improve the precision of question recommendation of an intelligent question-answering system.

It should be understood that the question recommendation method based on field similarity calculation proposed in the embodiments of the present application may be executed by various types of servers or terminal devices.

Referring to FIG. 1, a first embodiment of the question recommendation method based on field similarity calculation in the embodiments of the present application includes:

101. Acquiring an input first question sentence;

The user can enter the question to be asked, namely the first question sentence, on a terminal device by voice or manually, and the question sentence is sent to the intelligent question-answering system on the server side.

102. Performing word segmentation processing on the first question sentence, and extracting each field contained in the first question sentence;

After the server obtains the question sentence, it performs word segmentation on the sentence and extracts each field it contains. Any of the various word segmentation methods in the prior art can be used, for example jieba word segmentation. If the user asks "What is the average age of men in different occupations?", then jieba segmentation yields the field list ["male", "different", "occupation", "average", "age", "how", "?"].

103. Comparing the fields with fields in a field data table which is constructed in advance one by one, finding out the fields which are the same as the fields in the field data table, and determining the fields as target fields;

After obtaining the fields of the first question sentence by word segmentation, the server compares them one by one with the fields in the pre-constructed field data table, finds the extracted fields that are the same as fields in the field data table, and determines them as target fields.

The pre-constructed field data table may be as shown in table 1 below:

TABLE 1

Name	Occupation	Gender	Age	Personal after-tax monthly income	Industry
Zhang San	Police officer	Male	35	4500	Security
Li Si	Waiter	Female	29	4000	Service
…	…	…	…	…	…

In Table 1, "name", "occupation", "gender", "age", "personal after-tax monthly income" and "industry" are the fields of the field data table, and "Zhang San", "Li Si", "police officer", "waiter", "male", "female", "security", "service" and the like are enumerated values of the fields. When the field data table is constructed, these fields and enumerated values are written into a data structure; for example, in Python the data may be stored in a dict, forming a dict-type data structure table.
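As a minimal sketch of such a structure, Table 1 could be held in a Python dict that maps each field name to its enumerated values. The layout (field name as key, list of values as value) and the variable names `field_table` and `fields` are assumptions for illustration; the text only states that a dict may be used.

```python
# Hypothetical layout: field name -> list of enumerated values taken from Table 1.
field_table = {
    "name": ["Zhang San", "Li Si"],
    "occupation": ["police officer", "waiter"],
    "gender": ["male", "female"],
    "age": [35, 29],
    "personal after-tax monthly income": [4500, 4000],
    "industry": ["security", "service"],
}

# The dict keys are the fields that segmented words are later compared against.
fields = list(field_table)
```

Keeping the enumerated values next to each field name means the same structure can serve both the field comparison step and any calculation that needs the number or length of enumerated values.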

In addition, the fields can be added to jieba's custom dictionary so that field keywords are not split when the user's question is segmented. For example, by default jieba cuts the field keyword "personal after-tax monthly income" into three fields, "personal", "after-tax" and "monthly income", whereas it will not split the keyword once "personal after-tax monthly income" has been added to the custom dictionary.

Suppose the extracted fields are ["male", "different", "occupation", "average", "age", "how", "?"]. Comparing these fields with the fields in Table 1 finds the matching fields "occupation" and "age", which are taken as target fields. There may be one or more target fields.
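The comparison in step 103 can be sketched as a simple membership test. The function name `find_target_fields` is hypothetical, and the token list is assumed to be the output of the word segmentation step.

```python
def find_target_fields(tokens, table_fields):
    """Return the segmented tokens that also appear as fields in the field data table."""
    field_set = set(table_fields)  # set lookup avoids rescanning the table for every token
    return [tok for tok in tokens if tok in field_set]


# Segmentation result of the example question and the fields of Table 1.
tokens = ["male", "different", "occupation", "average", "age", "how", "?"]
table_fields = ["name", "occupation", "gender", "age",
                "personal after-tax monthly income", "industry"]

targets = find_target_fields(tokens, table_fields)
```

Here `targets` holds "occupation" and "age", in the order in which they occur in the question.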

104. Respectively calculating the similarity between the target field and each other field except the target field in the field data table;

After the target field is determined, the similarity between the target field and each other field in the field data table is calculated. In the example of Table 1, for the target field "occupation", the similarities between "occupation" and each of "name", "gender", "age", "personal after-tax monthly income" and "industry" are calculated. The similarity between the target field and any one of the other fields may be calculated by the following steps:

(1) calculating a similarity index of the target field and any one of the other fields by combining the character string and the enumerated value of the target field and the character string and the enumerated value of any one of the other fields, wherein the similarity index is a parameter for measuring the similarity between the two fields;

(2) and calculating the similarity between the target field and any one of the other fields according to the similarity indexes of the target field and any one of the other fields.

Attribute parameters of the character strings and enumerated values, such as the length of the string or the number and category of enumerated values, are important parameters for determining the degree of similarity between fields. Accordingly, calculating the similarity index between the target field and any one of the other fields may include calculating a character string similarity index, a character string length similarity index, an enumerated value number similarity index and an enumerated value length similarity index of the two fields.

Specifically, the character string similarity index may be calculated by the following formula:

wherein s₁ represents the character string similarity index, sim represents the number of identical characters in the two fields (i.e. the target field and the any one of the other fields), short represents the string length of the shorter of the two fields, long represents the string length of the longer of the two fields, and α is a hyperparameter used to control the influence of the character string on the similarity.

The formula has the effect of compressing s₁ to a value between 0 and 1. For example, for the two fields "personal after-tax monthly income" and "personal income tax", when s₁ of the two is calculated, sim is 3 (the shared characters corresponding to "person" and "tax"), short is 5, and long is 7, with lengths counted in characters of the original field names.

For the character string length similarity index s₂, short represents the string length of the shorter of the two fields (i.e. the target field and the any one of the other fields), and long represents the string length of the longer of the two fields; for example, s₂ can be calculated in this way for the fields "personal after-tax monthly income" and "occupation".

For the enumerated value number similarity index s₃, min represents the number of enumerated values of the field with fewer enumerated values, and max represents the number of enumerated values of the field with more enumerated values. For example, if in the field data table the "occupation" field has 6 enumerated values (police officer, nurse, teacher, programmer, student, clerk) and the "gender" field has 2 enumerated values (male and female), then min is 2 and max is 6 when s₃ of the two is calculated.

For the enumerated value length similarity index s₄, avg_min represents the average enumerated value length of the field whose enumerated values are shorter on average, and avg_max represents the average enumerated value length of the field whose enumerated values are longer on average. For example, if the average length of the enumerated values of the "occupation" field is (2+2+2+3+2+2)/6 ≈ 2.17 and the average length of the enumerated values of the "gender" field is (1+1)/2 = 1, then avg_min is 1 and avg_max is 2.17 when s₄ of the two is calculated.

Specifically, calculating the similarity between the target field and the any one of the other fields according to their similarity indices may include:

calculating an average value or a weighted average value of the character string similarity index, the character string length similarity index, the enumerated value number similarity index and the enumerated value length similarity index as the similarity between the target field and the any one of the other fields; for example, the similarity of the two fields may be taken as the average of the four indices.
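The combination step can be sketched as follows. The averaging and weighted averaging follow the text; the ratio form used here for s₂, s₃ and s₄ (smaller quantity divided by larger quantity) is an assumption suggested by the variable definitions, not a formula confirmed by the text, and s₁ is passed in as an already computed value. The names `ratio` and `field_similarity` are hypothetical.

```python
from statistics import mean


def ratio(a, b):
    """ASSUMED form of s2, s3 and s4: the smaller quantity divided by the larger one."""
    lo, hi = sorted((a, b))
    return lo / hi if hi else 0.0


def field_similarity(s1, s2, s3, s4, weights=None):
    """Average (or weighted average, if weights are given) of the four indices."""
    indices = [s1, s2, s3, s4]
    if weights is None:
        return mean(indices)
    return sum(w * s for w, s in zip(weights, indices)) / sum(weights)


# Quantities quoted in the examples of the text.
s2 = ratio(5, 7)                          # string lengths 5 and 7
s3 = ratio(2, 6)                          # 2 and 6 enumerated values
s4 = ratio(mean([1, 1]), mean([2, 2, 2, 3, 2, 2]))  # average enumerated value lengths

s1 = 0.5                                  # hypothetical placeholder value for s1
similarity = field_similarity(s1, s2, s3, s4)
weighted = field_similarity(s1, s2, s3, s4, weights=[2, 1, 1, 1])
```

Passing `weights` lets one index, for instance the character string similarity, count more than the others; omitting it gives the plain average.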

105. And selecting the field with the highest similarity in the other fields, and replacing the target field in the first question sentence to obtain the recommended second question sentence.

After the similarity between the target field and each other field in the field data table has been calculated, the field with the highest similarity among the other fields is selected and used to replace the target field in the first question sentence, giving the recommended second question sentence. For example, if the first question sentence is "How is the average income of different occupations in Shanghai distributed?", where "occupation" is the target field, and the field in the field data table most similar to "occupation" is "industry", then "industry" replaces "occupation" in the first question sentence to give the second question sentence: "How is the average income of different industries in Shanghai distributed?". Finally, the second question sentence is recommended to the user, which completes one round of question recommendation.
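Step 105 can be sketched on the segmented form of the question. The function name `recommend`, the token list and the similarity scores below are all illustrative assumptions; only the selection of the highest-scoring field and the replacement of the target field come from the text.

```python
def recommend(tokens, target, similarities):
    """Return a copy of tokens in which target is replaced by its most similar field."""
    best = max(similarities, key=similarities.get)
    return [best if tok == target else tok for tok in tokens]


# Segmented first question sentence (illustrative tokens).
first_question = ["Shanghai", "different", "occupation", "average income", "distribution"]

# Hypothetical similarity scores between "occupation" and the other fields.
scores = {"name": 0.21, "gender": 0.35, "age": 0.40, "industry": 0.82}

second_question = recommend(first_question, "occupation", scores)
```

The original token list is left unchanged, so the same first question can be reused to build further recommendations for other target fields.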

In this embodiment of the present application, after the fields of the input question sentence are extracted, they are compared one by one with the fields in a pre-constructed field data table, and each extracted field that also appears in the field data table is determined as a target field; the similarity between the target field and each other field in the field data table is then calculated, and the field with the highest similarity is used to replace the target field in the question sentence, thereby obtaining the recommended question sentence. Compared with the conventional approach of randomly replacing keywords of the same type in a sentence, this embodiment takes the similarity among all preset fields into account and replaces the field in the original question sentence with the most similar field, so it can generate new question sentences that better match the user's expectations and improve the accuracy of the questions recommended by the intelligent question-answering system.

Referring to FIG. 2, a second embodiment of the question recommendation method based on field similarity calculation in the embodiments of the present application includes:

201. acquiring an input first question sentence;

202. performing word segmentation processing on the first question sentence, and extracting each field contained in the first question sentence;

203. comparing the fields with fields in a field data table which is constructed in advance one by one, finding out the fields which are the same as the fields in the field data table, and determining the fields as target fields;

Steps 201-203 are the same as steps 101-103; refer to the related description of steps 101-103.

204. Searching all historical question sentences of the user who inputs the first question sentence;

After determining the target field, the server obtains the historical question records of the user who input the first question sentence and finds all of that user's historical question sentences.

205. Constructing a co-occurrence matrix according to the historical question sentences, wherein the co-occurrence matrix records the times of the common occurrence of any two fields in the field data table in the same historical question sentences of the user;

Then, a co-occurrence matrix is constructed from the historical question sentences; it records the number of times any two fields in the field data table appear together in the same historical question sentence of the user. For example, a co-occurrence matrix M constructed from the user's historical question sentences is:

$$
M = \begin{pmatrix}
0 & 18 & 27 & 22 & 3 \\
18 & 0 & 2 & 15 & 5 \\
27 & 2 & 0 & 30 & 10 \\
22 & 15 & 30 & 0 & 21 \\
3 & 5 & 10 & 21 & 0
\end{pmatrix}
$$

The co-occurrence matrix M corresponds to Table 2 below:

TABLE 2

Co-occurrence matrix M	Occupation	Gender	Age	Personal after-tax monthly income	Industry
Occupation	-	18	27	22	3
Gender	18	-	2	15	5
Age	27	2	-	30	10
Personal after-tax monthly income	22	15	30	-	21
Industry	3	5	10	21	-

In Table 2, the value corresponding to "gender" and "occupation" is 18, which indicates that, among all the historical question sentences of the user, "gender" and "occupation" appear together in the same sentence 18 times. For example, all question sentences previously asked by the user are stored in advance, such as "relationship between different genders and occupations", "proportion of unmarried people by occupation and gender", …, "correlation between different genders and different occupations". Each of these questions contains both "occupation" and "gender"; if there are 18 such questions, the two fields co-occur 18 times.
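The construction of the co-occurrence matrix can be sketched as below, assuming that each historical question sentence has already been segmented into a list of words. The function name `build_cooccurrence` and the three sample sentences are hypothetical; the counts in Table 2 are not reproduced by this small sample.

```python
from itertools import combinations


def build_cooccurrence(fields, history):
    """Count, for every pair of fields, the historical sentences that contain both."""
    index = {field: i for i, field in enumerate(fields)}
    matrix = [[0] * len(fields) for _ in fields]
    for tokens in history:
        # Each field counts at most once per sentence, however often it is repeated.
        present = sorted({index[tok] for tok in tokens if tok in index})
        for i, j in combinations(present, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1  # the matrix is symmetric
    return matrix


fields = ["occupation", "gender", "age"]
history = [
    ["different", "gender", "occupation", "relationship"],
    ["occupation", "gender", "unmarried", "proportion"],
    ["different", "age", "occupation"],
]
matrix = build_cooccurrence(fields, history)
```

The diagonal stays at 0 because a field is never paired with itself, which matches the "-" entries of Table 2.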

206. Calculating the similarity between the target field and each other field except the target field in the field data table according to the co-occurrence matrix;

After the co-occurrence matrix is constructed, the similarity between the target field and each other field in the field data table can be calculated according to the co-occurrence matrix.

Specifically, step 206 may include:

(1) extracting field vectors of the target fields and field vectors of each other field from the co-occurrence matrix respectively, wherein each element of the field vectors is the times of the common occurrence of the corresponding field and each field in the field data table in the same historical question sentence of the user;

(2) and respectively calculating cosine similarity between the field vector of the target field and the field vectors of each other field to obtain the similarity between the target field and each other field.

In the co-occurrence matrix, each field corresponds to a field vector; for example, the field vector of "occupation" is [0, 18, 27, 22, 3] and the field vector of "gender" is [18, 0, 2, 15, 5]. That is, the row or column of the co-occurrence matrix in which a field is located is the field vector of that field. After the field vectors are extracted, the cosine similarity between the field vector of the target field and the field vector of each other field is calculated, which gives the similarity between the target field and each other field. For example, if the target field is "occupation", its similarity to the other field "gender" equals the cosine similarity of the vectors [0, 18, 27, 22, 3] and [18, 0, 2, 15, 5].
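A minimal sketch of step 206 using the counts of Table 2, with the "-" diagonal entries taken as 0 as in the field vectors quoted above. The names `FIELDS`, `M`, `cosine_similarity` and `similarities_to` are hypothetical.

```python
import math

FIELDS = ["occupation", "gender", "age",
          "personal after-tax monthly income", "industry"]

# Co-occurrence counts from Table 2; each row is the field vector of one field.
M = [
    [0, 18, 27, 22, 3],
    [18, 0, 2, 15, 5],
    [27, 2, 0, 30, 10],
    [22, 15, 30, 0, 21],
    [3, 5, 10, 21, 0],
]


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarities_to(target):
    """Cosine similarity between the target's field vector and every other field's."""
    t = FIELDS.index(target)
    return {f: cosine_similarity(M[t], M[i]) for i, f in enumerate(FIELDS) if i != t}


sims = similarities_to("occupation")
best = max(sims, key=sims.get)
```

With these counts, `best` is "industry" for the target field "occupation", since its field vector points in the direction closest to that of "occupation".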

207. And selecting the field with the highest similarity in the other fields, and replacing the target field in the first question sentence to obtain the recommended second question sentence.

Step 207 is the same as step 105, and the related description of step 105 can be referred to.

After extracting each field of the input question sentence, the embodiment of the application compares each field with fields in a field data table constructed in advance one by one, finds out the same field in the extracted field and the field data table, and determines the same field as a target field; then, all historical question sentences input by the user are searched, a co-occurrence matrix is constructed, the similarity between the target field and each other field except the target field in the field data table is calculated according to the co-occurrence matrix, the field with the highest similarity is found, and the target field in the question sentences is replaced, so that the recommended question sentences are obtained. Compared with the first embodiment of the present application, this embodiment proposes a specific way of calculating the similarity between the target field and each of the other fields.

Referring to FIG. 3, a third embodiment of the question recommendation method based on field similarity calculation in the embodiments of the present application includes:

301. acquiring an input first question sentence;

302. performing word segmentation processing on the first question sentence, and extracting each field contained in the first question sentence;

303. comparing the fields with fields in a field data table which is constructed in advance one by one, finding out the fields which are the same as the fields in the field data table, and determining the fields as target fields;

304. searching all historical question sentences of the user who inputs the first question sentence;

305. constructing a co-occurrence matrix according to the historical question sentences, wherein the co-occurrence matrix records the times of the common occurrence of any two fields in the field data table in the same historical question sentences of the user;

Steps 301-305 are the same as steps 201-205; refer to the related description of steps 201-205.

306. Determining, according to the co-occurrence matrix, the field in the field data table that appears together with the target field in the same historical question sentences of the user the greatest number of times;

307. Selecting that field to replace the target field in the first question sentence to obtain a recommended third question sentence.

After the co-occurrence matrix is constructed, the field with the most times, which appears in the same historical question sentence of the user together with the target field, in the field data table can be determined according to the co-occurrence matrix, then the field with the most times is selected, and the target field in the first question sentence is replaced, so that the recommended third question sentence is obtained.

For example, suppose the first question sentence is "how is the average income of different jobs in Shanghai distributed", where "job" is the target field. If the field having the largest number of co-occurrences with the field "job" in the co-occurrence matrix M is "age" (27 times), the "job" in the first question sentence may be replaced with "age", thereby obtaining a third question sentence: "how is the average income of different ages in Shanghai distributed".
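The flow of steps 304 to 307 can be illustrated with a minimal Python sketch. This is not the applicant's implementation: the whitespace tokenizer stands in for real word segmentation, and the field data table, the historical question sentences, and all function names are hypothetical values chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def tokenize(question):
    # Stand-in for real word segmentation (an assumption): lowercase whitespace split.
    return question.lower().split()


def build_cooccurrence(history, field_table):
    """Count, for every ordered pair of table fields, the questions containing both."""
    table = set(field_table)
    counts = Counter()
    for question in history:
        present = sorted(table.intersection(tokenize(question)))
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def most_cooccurring(target, counts, field_table):
    """Return the other field that co-occurs with `target` most often, or None."""
    candidates = [f for f in field_table if f != target]
    best = max(candidates, key=lambda f: counts[(target, f)], default=None)
    if best is None or counts[(target, best)] == 0:
        return None
    return best


def replace_field(question, target, replacement):
    """Swap the target field for the replacement, token by token."""
    return " ".join(replacement if tok.lower() == target else tok
                    for tok in question.split())


field_table = ["job", "age", "city"]   # hypothetical field data table
history = [                            # hypothetical historical question sentences
    "average income by job and age",
    "average income by job and age in shanghai",
    "average income by job and city",
]
counts = build_cooccurrence(history, field_table)
best = most_cooccurring("job", counts, field_table)
third_question = replace_field("average income by job in shanghai", "job", best)
```

In this toy data "job" co-occurs with "age" twice and with "city" once, so "age" is selected. Matching on whole tokens rather than substrings avoids treating "age" as a hit inside "average".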

After extracting each field of the input question sentence, this embodiment of the application compares the extracted fields one by one with the fields in a pre-constructed field data table, finds the extracted fields that also appear in the field data table, and determines them as target fields. It then searches all historical question sentences input by the user and constructs a co-occurrence matrix. According to the co-occurrence matrix, it determines the field in the field data table that appears together with the target field in the same historical question sentence of the user the most times, selects that field, and uses it to replace the target field in the first question sentence, obtaining a recommended third question sentence. Compared with the second embodiment of the present application, this embodiment proposes a question sentence generation manner that also uses the co-occurrence matrix but does not calculate the similarity between fields.

It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.

Corresponding to the problem recommendation method based on field similarity calculation described in the above embodiments, fig. 4 shows a block diagram of a problem recommendation device based on field similarity calculation provided in the embodiments of the present application, and for convenience of description, only the parts related to the embodiments of the present application are shown.

Referring to fig. 4, the apparatus includes:

a question acquiring module 401, configured to acquire an input first question sentence;

a word segmentation module 402, configured to perform word segmentation on the first question sentence, and extract each field included in the first question sentence;

a field comparison module 403, configured to compare the fields one by one with the fields in a pre-constructed field data table, find the fields that are the same as fields in the field data table, and determine them as target fields;

a field similarity calculation module 404, configured to calculate similarities between the target field and each of the other fields in the field data table except the target field;

and a question recommendation module 405, configured to select the field with the highest similarity from the other fields and use it to replace the target field in the first question sentence, obtaining a recommended second question sentence.
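How modules 402 to 405 fit together can be sketched as a few composable Python functions. This is only an illustrative sketch under stated assumptions: the whitespace split stands in for real word segmentation, the similarity function is passed in as a parameter because the application describes several ways of computing it, and the function names, the field data table, and the similarity scores are all hypothetical.

```python
def extract_fields(question):
    # Module 402 stand-in (an assumption): lowercase whitespace split
    # instead of real word segmentation.
    return question.lower().split()


def find_target_fields(fields, field_table):
    # Module 403: keep the extracted fields that also exist in the field data table.
    table = set(field_table)
    return [f for f in fields if f in table]


def recommend_question(question, field_table, similarity):
    """Modules 404 and 405: score the other fields, then replace the target field."""
    targets = find_target_fields(extract_fields(question), field_table)
    if not targets:
        return None
    target = targets[0]  # this sketch only handles the first target field
    others = [f for f in field_table if f != target]
    if not others:
        return None
    best = max(others, key=lambda f: similarity(target, f))
    return " ".join(best if tok.lower() == target else tok
                    for tok in question.split())


# Hypothetical field data table and similarity scores, for illustration only.
field_table = ["job", "age", "city"]
scores = {("job", "age"): 0.9, ("job", "city"): 0.4}
second_question = recommend_question(
    "average income by job in Shanghai",
    field_table,
    lambda a, b: scores.get((a, b), 0.0),
)
```

Passing the similarity function as an argument keeps the acquisition, comparison, and replacement logic unchanged whichever similarity calculation is plugged in.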

Further, the field similarity calculation module may include:

a similarity index calculation unit, configured to calculate a similarity index between the target field and any one of the other fields by combining the character string and the enumerated value of the target field and the character string and the enumerated value of the any one of the other fields, where the similarity index is a parameter used for measuring a degree of similarity between the two fields;

and the first field similarity calculation unit is used for calculating and obtaining the similarity between the target field and any one of the other fields according to the similarity indexes of the target field and any one of the other fields.

Further, the similarity index calculation unit may specifically be configured to: calculating a character string similarity index, a character string length similarity index, an enumerated value number similarity index and an enumerated value length similarity index of the target field and any other field;

the first field similarity calculation unit may be specifically configured to: and calculating an average value or a weighted average value of the character string similarity index, the character string length similarity index, the enumerated value number similarity index and the enumerated value length similarity index as the similarity of the target field and any other field.
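Combining the four similarity indexes into one similarity by an average or a weighted average can be sketched as follows. This is a minimal sketch: the function name, the index values, and the weights are hypothetical, and the way each individual index is computed is not reproduced here.

```python
def combine_indexes(indexes, weights=None):
    """Return the plain or weighted average of the similarity indexes."""
    if weights is None:
        weights = [1.0] * len(indexes)  # equal weights give the plain average
    if len(weights) != len(indexes):
        raise ValueError("one weight is required per similarity index")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, indexes)) / total


# Hypothetical values for the string, string-length, enumerated-value-number
# and enumerated-value-length similarity indexes.
indexes = [0.5, 0.8, 1.0, 0.7]
plain = combine_indexes(indexes)
weighted = combine_indexes(indexes, weights=[2.0, 1.0, 1.0, 0.0])
```

Normalizing by the sum of the weights keeps the result in the same range as the individual indexes, so weights do not need to sum to one.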

Further, the character string similarity index may be calculated by using the following formula:

wherein s₁ represents the character string similarity index, sim represents the number of identical character strings of the two fields, short represents the character string length of the shorter of the two fields, long represents the character string length of the longer of the two fields, and α is a hyper-parameter for controlling the influence of the character string on the similarity;

wherein s₂ represents the character string length similarity index, short represents the character string length of the shorter of the two fields, and long represents the character string length of the longer of the two fields;

Further, the field similarity calculation module may include:

the historical sentence searching unit is used for searching all historical question sentences of the user who inputs the first question sentence;

a co-occurrence matrix construction unit, configured to construct a co-occurrence matrix according to the historical question statement, where the co-occurrence matrix records the number of times that any two fields in the field data table appear in the same historical question statement of the user;

and the second field similarity calculation unit is used for calculating the similarity between the target field and each other field according to the co-occurrence matrix.

Further, the second field similarity calculation unit may include:

a field vector extraction subunit, configured to extract, from the co-occurrence matrix, a field vector of the target field and a field vector of each of the other fields, where each element of the field vector is a number of times that a corresponding field and each field in the field data table appear in a same history question sentence of the user;

and a cosine similarity calculation subunit, configured to calculate the cosine similarity between the field vector of the target field and the field vector of each other field, thereby obtaining the similarity between the target field and each other field.
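The field vector extraction and cosine similarity calculation can be sketched in plain Python as follows. This sketch assumes the co-occurrence matrix is stored as a square list of lists whose row i is the field vector of field i. Apart from the 27 co-occurrences of "job" and "age" taken from the example in the description, all matrix values and names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


fields = ["job", "age", "city"]
# Hypothetical co-occurrence matrix M; row i is the field vector of fields[i].
M = [
    [0, 27, 5],
    [27, 0, 6],
    [5, 6, 0],
]


def most_similar(target):
    """Return the other field whose field vector is closest to the target's."""
    t = fields.index(target)
    others = [i for i in range(len(fields)) if i != t]
    best = max(others, key=lambda i: cosine_similarity(M[t], M[i]))
    return fields[best]


best_for_job = most_similar("job")
```

With these illustrative numbers the cosine similarity selects "city" for "job", because both co-occur mainly with "age", whereas choosing by raw co-occurrence count would select "age". The two selection rules therefore need not agree.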

Further, the field similarity calculation module may further include:

a field determination unit, configured to determine, according to the co-occurrence matrix, the field in the field data table that appears together with the target field in the same historical question sentence of the user the most times;

and the field replacing module is used for selecting the field with the most times, and replacing the target field in the first question sentence to obtain the recommended third question sentence.

An embodiment of the present application further provides a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the steps of any one of the problem recommendation methods based on field similarity calculation shown in fig. 1 to 3.

The embodiment of the present application further provides a server, which includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor, when executing the computer-readable instructions, implements the steps of any one of the problem recommendation methods based on field similarity calculation shown in fig. 1 to 3.

The embodiment of the present application further provides a computer program product which, when run on a server, causes the server to execute the steps of any one of the problem recommendation methods based on field similarity calculation shown in fig. 1 to 3.

Fig. 5 is a schematic diagram of a server according to an embodiment of the present application. As shown in fig. 5, the server 5 of this embodiment includes: a processor 50, a memory 51, and computer readable instructions 52 stored in said memory 51 and executable on said processor 50. The processor 50, when executing the computer readable instructions 52, implements the steps in the above-described embodiments of the method for problem recommendation based on field similarity calculation, such as the steps 101 to 105 shown in fig. 1. Alternatively, the processor 50, when executing the computer readable instructions 52, implements the functions of the modules/units in the above-mentioned device embodiments, such as the functions of the modules 401 to 405 shown in fig. 4.

Illustratively, the computer readable instructions 52 may be partitioned into one or more modules/units that are stored in the memory 51 and executed by the processor 50 to accomplish the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, which are used to describe the execution of the computer-readable instructions 52 in the server 5.

The server 5 may be a computing device such as a smart phone, a notebook, a palm computer, and a cloud server. The server 5 may include, but is not limited to, a processor 50, a memory 51. Those skilled in the art will appreciate that fig. 5 is merely an example of a server 5 and does not constitute a limitation of the server 5 and may include more or fewer components than shown, or some components in combination, or different components, e.g., the server 5 may also include input output devices, network access devices, buses, etc.

The processor 50 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The memory 51 may be an internal storage unit of the server 5, such as a hard disk or a memory of the server 5. The memory 51 may also be an external storage device of the server 5, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the server 5. Further, the memory 51 may also include both an internal storage unit and an external storage device of the server 5. The memory 51 is used to store the computer-readable instructions and other programs and data required by the server. The memory 51 may also be used to temporarily store data that has been output or is to be output.

It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims

1. A problem recommendation method based on field similarity calculation is characterized by comprising the following steps:

acquiring an input first question sentence;

2. The question recommendation method of claim 1 wherein the similarity between the target field and any one of the other fields in the field data table is calculated by:

3. The question recommendation method of claim 2, wherein said calculating a similarity measure of the target field and the any one of the other fields comprises:

the calculating the similarity between the target field and the any one other field according to the similarity index between the target field and the any one other field includes:

4. The question recommendation method of claim 3, wherein the string similarity index is calculated using the following formula:

the character string length similarity index is calculated by adopting the following formula:

the enumeration value number similarity index is calculated by adopting the following formula:

wherein s₃ represents the enumerated value number similarity index, min represents the number of enumerated values of the field with fewer enumerated values of the two fields, and max represents the number of enumerated values of the field with more enumerated values of the two fields;

the enumerated value length similarity index is calculated by adopting the following formula:

5. The question recommendation method of claim 1, wherein said separately calculating the similarity between the target field and each of the other fields in the field data table except the target field comprises:

6. The question recommendation method of claim 5, wherein said calculating a similarity between the target field and the respective other fields according to the co-occurrence matrix comprises:

7. The question recommendation method according to claim 5 or 6, after constructing a co-occurrence matrix from the historical question sentences, further comprising:

8. A question recommendation apparatus based on field similarity calculation, comprising:

9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the question recommendation method as claimed in any one of claims 1 to 7.

10. A server comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the problem recommendation method according to any one of claims 1 to 7 when executing the computer program.