Disclosure of Invention
Therefore, an object of the present invention is to provide a topic pushing method to automatically push topic contents suitable for a local education and teaching test, so as to solve the problems of low efficiency and large workload of manual annotation.
A topic pushing method comprises the following steps:
obtaining a region diffusion chain of a pre-pushed topic, wherein the region diffusion chain comprises a diffusion level integer value and a similarity corresponding to each region in a topic source database;
calculating a vector correlation value between a target region and the region where the pre-pushed topic is located according to the diffusion level integer value and the similarity;
calculating a correlation degree value between the pre-pushed topic and the target region according to the vector correlation value and the distance between the target region and the region where the pre-pushed topic is located;
if the correlation degree value is greater than a threshold value, adding the pre-pushed topic to the topic push set of the target region, and pushing the topics in the topic push set of the target region when a topic push request from the target region is received.
According to the topic pushing method provided by the invention, a region diffusion chain of a pre-pushed topic is first obtained, the region diffusion chain comprising a diffusion level integer value and a similarity corresponding to each region in a topic source database. A vector correlation value between a target region and the region where the pre-pushed topic is located is then calculated according to the diffusion level integer value and the similarity, which realizes a regional vector topology analysis of topic contents. Next, a correlation degree value between the pre-pushed topic and the target region is calculated according to the vector correlation value and the distance between the target region and the region where the pre-pushed topic is located, which yields a dynamic applicability score for each topic in different regions. If the correlation degree value is greater than a threshold value, the pre-pushed topic is added to the topic push set of the target region for subsequent pushing. Automatic regional labeling is thereby achieved without manual labeling, which helps teachers quickly screen out topics suited to their own region.
In addition, the topic pushing method according to the present invention may further have the following additional technical features:
Further, in the step of calculating the vector correlation value between the target region and the region where the pre-pushed topic is located according to the diffusion level integer value and the similarity, the vector correlation value is calculated by a formula in which the pre-pushed topic T belongs to a topic set, A is the target region, X is the region where the pre-pushed topic is located, Y is the diffusion level integer value, and S is the similarity to T.
Further, in the step of calculating the correlation degree value between the pre-pushed topic and the target region according to the vector correlation value and the distance between the target region and the region where the pre-pushed topic is located, the correlation degree value R(T, A) is calculated by a formula based on the vector correlation value and the distance between region X and region A.
Further, the threshold value is 0.5.
Further, the method further comprises:
calculating a second vector correlation value between the target region and each region other than the target region in the topic source database;
adding the topics of each region whose second vector correlation value is greater than a correlation threshold to the topic push set of the target region, and pushing the topics in the topic push set of the target region when a topic push request from the target region is received.
Another objective of the present invention is to provide a topic push system to automatically push topic contents suitable for local education and teaching examinations, so as to solve the problems of low efficiency and heavy workload of manual annotation.
A topic push system, comprising:
an acquisition module, configured to obtain a region diffusion chain of a pre-pushed topic, wherein the region diffusion chain comprises a diffusion level integer value and a similarity corresponding to each region in a topic source database;
a first calculation module, configured to calculate a vector correlation value between a target region and the region where the pre-pushed topic is located according to the diffusion level integer value and the similarity;
a second calculation module, configured to calculate a correlation degree value between the pre-pushed topic and the target region according to the vector correlation value and the distance between the target region and the region where the pre-pushed topic is located;
and a first pushing module, configured to add the pre-pushed topic to the topic push set of the target region if the correlation degree value is greater than a threshold value, and to push the topics in the topic push set of the target region when a topic push request from the target region is received.
According to the topic pushing system provided by the invention, a region diffusion chain of a pre-pushed topic is first obtained, the region diffusion chain comprising a diffusion level integer value and a similarity corresponding to each region in a topic source database. A vector correlation value between a target region and the region where the pre-pushed topic is located is then calculated according to the diffusion level integer value and the similarity, which realizes a regional vector topology analysis of topic contents. Next, a correlation degree value between the pre-pushed topic and the target region is calculated according to the vector correlation value and the distance between the target region and the region where the pre-pushed topic is located, which yields a dynamic applicability score for each topic in different regions. If the correlation degree value is greater than a threshold value, the pre-pushed topic is added to the topic push set of the target region for subsequent pushing. Automatic regional labeling is thereby achieved without manual labeling, which helps teachers quickly screen out topics suited to their own region.
In addition, the topic pushing system according to the present invention may further have the following additional technical features:
Further, the first calculation module is configured to calculate the vector correlation value by a formula in which the pre-pushed topic T belongs to a topic set, A is the target region, X is the region where the pre-pushed topic is located, Y is the diffusion level integer value, and S is the similarity to T.
Further, the second calculation module is configured to calculate the correlation degree value R(T, A) by a formula based on the vector correlation value and the distance between region X and region A.
Further, the threshold value is 0.5.
Further, the system further comprises:
a third calculation module, configured to calculate a second vector correlation value between the target region and each region other than the target region in the topic source database;
and a second pushing module, configured to add the topics of each region whose second vector correlation value is greater than a correlation threshold to the topic push set of the target region, and to push the topics in the topic push set of the target region when a topic push request from the target region is received.
The invention also proposes a readable storage medium on which a computer program is stored which, when executed by a processor, carries out the steps of the above-mentioned method.
The invention also proposes a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the program.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, a topic pushing method according to a first embodiment of the present invention includes steps S101 to S104.
S101, obtaining a region diffusion chain of a pre-pushed topic, wherein the region diffusion chain comprises a diffusion level integer value and a similarity corresponding to each region in a topic source database.
Before the region diffusion chain is obtained, a topic source database needs to be established. This specifically includes establishing a nationwide information base of cities/districts/counties and their schools, establishing information tables and longitude and latitude information for 1800 cities/districts/counties across the country, and establishing the attribution relationships of 24 thousand schools across the country. The Internet provides a large amount of publicly available test paper content, whose main information comprises the test paper name and the test paper content (usually carried in Word, PDF, or image files). This content can be automatically crawled and saved by crawler technology. In addition, the information generated when teachers compile homework or examination questions in the question bank itself is also recorded, including key information such as the school where the teacher is located, the related city/district/county, the topic content, and the time at which the topics were compiled.
By these means, a constantly updated topic source database can be established, whose key information comprises: (1) topic content, (2) topic source (e.g., the 2019-2020 monthly exam of XX Middle School in XX City, or the mathematics assignment of XX Middle School in XX City on 2019-12-30), and (3) subject.
The proportion of original content in the homework and examination questions of most schools is generally low, whereas that of nationally renowned schools (generally the top 100 schools) is higher. Taking physics as an example, original questions account for about 50 percent of a test paper at nationally renowned schools, while ordinary schools have almost none and usually edit parts of existing questions. Consequently, a region diffusion chain of a topic can be obtained by comparing the similarity of topic contents.
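The similarity comparison mentioned above can be illustrated with a minimal sketch. The source does not state which similarity measure is used, so Python's standard `difflib.SequenceMatcher` is swapped in here purely as a stand-in; the function name `topic_similarity` and the sample texts are hypothetical.

```python
from difflib import SequenceMatcher


def topic_similarity(text_a: str, text_b: str) -> float:
    """Return a similarity in [0, 1] between two topic texts.

    Assumption: character-level sequence matching stands in for the
    unspecified similarity measure of the source.
    """
    return SequenceMatcher(None, text_a, text_b).ratio()


# Hypothetical sample topics: an original, a lightly edited copy,
# and an unrelated question.
original = "A block of mass 2 kg slides down a frictionless incline of 30 degrees."
edited = "A block of mass 3 kg slides down a frictionless incline of 30 degrees."
unrelated = "Balance the chemical equation for the combustion of methane."

sim_edited = topic_similarity(original, edited)
sim_unrelated = topic_similarity(original, unrelated)
```

A lightly edited copy scores close to 1, while an unrelated question scores much lower, which is the property needed to trace a topic back to the region it was adapted from.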
It should be noted that the method of this embodiment is applied per subject, and each subject (e.g., physics) has its own topic set. For a given topic T_x in the topic set, the region diffusion chain is defined over the following quantities: L_x is a region; Y_x is the diffusion level integer value (as shown in FIG. 2, Hebei Hengshui is 0; Hebei Shijiazhuang and Hebei Langfang are 1; Hebei Zhengding, Shandong Ziziqiang, and Jiangxi Nanchang are 2; Shandong Qingdao is 3; and Shandong Weifang is 4); and S_x is the similarity between the topic of the corresponding region and T_x, a value between 0 and 1.
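As an illustration of the quantities just described, the following is a minimal sketch of one possible in-memory representation of a region diffusion chain. The class name `ChainEntry`, the helper functions, the region labels "A" to "E", and all similarity values are hypothetical; only the three fields (region, diffusion level integer value, similarity between 0 and 1) come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChainEntry:
    region: str        # L_x: the region
    level: int         # Y_x: diffusion level integer value (0 = origin)
    similarity: float  # S_x: similarity to topic T_x, between 0 and 1

    def __post_init__(self) -> None:
        if self.level < 0:
            raise ValueError("diffusion level must be non-negative")
        if not 0.0 <= self.similarity <= 1.0:
            raise ValueError("similarity must lie between 0 and 1")


def origin(chain: List[ChainEntry]) -> ChainEntry:
    """Return the entry at diffusion level 0 (where the topic originated)."""
    return next(entry for entry in chain if entry.level == 0)


def regions_at_level(chain: List[ChainEntry], level: int) -> List[str]:
    """Return the regions that sit at the given diffusion level."""
    return [entry.region for entry in chain if entry.level == level]


# Hypothetical chain for one topic: it originates in region "A" and
# spreads outward with decreasing similarity.
chain = [
    ChainEntry("A", 0, 1.00),
    ChainEntry("B", 1, 0.92),
    ChainEntry("C", 1, 0.88),
    ChainEntry("D", 2, 0.75),
    ChainEntry("E", 3, 0.60),
]
```

Keeping the entries immutable and validated at construction is a design choice of this sketch, not something the source prescribes.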
S102, calculating a vector correlation value between a target region and the region where the pre-pushed topic is located according to the diffusion level integer value and the similarity.
Given two regions A and B, the vector correlation values between A and B in each direction can be calculated from the topic region diffusion chains. It should be noted that the correlation between A and B is unidirectional: for example, the fact that topics from Hebei Hengshui are applicable to Hebei Langfang does not mean that topics from Hebei Langfang are applicable to Hebei Hengshui.
In this calculation, the number of occurrences of region A is counted such that each topic derived from A is counted once.
From the above calculation, the larger the vector correlation value, the more suitable the topics of region B are for region A. Conversely, when the value is 0 or negative, the topics of region B are not applicable to region A.
Based on the above, the vector correlation value between the target region and the region where the pre-pushed topic is located can be calculated according to the diffusion level integer value and the similarity, by a formula in which the pre-pushed topic T belongs to a topic set, A is the target region, X is the region where the pre-pushed topic is located, Y is the diffusion level integer value, and S is the similarity to T.
S103, calculating a correlation degree value of the pre-pushed topic and the target region according to the vector correlation value and the distance between the target region and the region where the pre-pushed topic is located.
The correlation degree value R(T, A) is calculated by a formula based on the vector correlation value and the distance between region X and region A. This distance is pre-stored in the computer and may be expressed in kilometers. It is determined from the actual distance between the two geographical regions and is a defined value; for example, the distance between the two regions of Hebei Shizhu can be defined as 150 km. It should be noted that the formula uses only the numerical value of the distance (i.e., 150), without units, and that the 10 in the formula is an empirical value.
S104, if the correlation degree value is greater than a threshold value, adding the pre-pushed topic to the topic push set of the target region, and pushing the topics in the topic push set of the target region when a topic push request from the target region is received.
In this embodiment, the threshold value is set to 0.5 as an empirical value. If the calculated correlation degree value is greater than 0.5, the pre-pushed topic is suitable for the target region A and is added to the topic push set of the target region; through continuous updating, the topics suitable for the target region gradually accumulate. When a topic push request from the target region is received, the topics in the topic push set of the target region are pushed; because the pre-pushed topic is in that set, it can be pushed to the user. In a specific implementation, the user can choose to have all topics in the topic push set of the target region pushed, or only a preset number of them.
It can be understood that if the calculated correlation degree value is less than or equal to 0.5, the pre-pushed topic is not suitable for the target region A and is not added to the topic push set of the target region; when a topic push request from the target region is received, the pre-pushed topic is not pushed.
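The decision in step S104 can be sketched as follows. This is a minimal illustration that takes an already computed correlation degree value as input, since the formula for R(T, A) is not reproduced here. The class name `PushSets`, its method names, and the topic identifiers are hypothetical; the 0.5 threshold, the strict "greater than" comparison, and the option of pushing all or a preset number of topics come from the text.

```python
from collections import defaultdict
from typing import Dict, List, Optional

THRESHOLD = 0.5  # empirical threshold stated in this embodiment


class PushSets:
    """Per-region topic push sets (hypothetical helper for illustration)."""

    def __init__(self, threshold: float = THRESHOLD) -> None:
        self.threshold = threshold
        self._sets: Dict[str, List[str]] = defaultdict(list)

    def consider(self, topic_id: str, target_region: str, relevance: float) -> bool:
        """Add the topic to the region's push set if relevance > threshold."""
        if relevance > self.threshold:
            if topic_id not in self._sets[target_region]:
                self._sets[target_region].append(topic_id)
            return True
        return False  # relevance <= threshold: the topic is not added

    def push(self, target_region: str, limit: Optional[int] = None) -> List[str]:
        """Answer a push request with all topics, or only the first `limit`."""
        topics = self._sets.get(target_region, [])
        return list(topics) if limit is None else topics[:limit]


push_sets = PushSets()
added_t1 = push_sets.consider("t1", "A", 0.80)  # above the threshold
added_t2 = push_sets.consider("t2", "A", 0.50)  # equal to the threshold
added_t3 = push_sets.consider("t3", "A", 0.51)  # just above the threshold
```

Here a value exactly equal to 0.5 is rejected, matching the "less than or equal to 0.5" case described above.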
Further, as a specific example, the method further includes:
calculating a second vector correlation value between the target region and each region other than the target region in the topic source database;
adding the topics of each region whose second vector correlation value is greater than a correlation threshold to the topic push set of the target region, and pushing the topics in the topic push set of the target region when a topic push request from the target region is received.
Suppose that, for the topic set, the topic source database contains regions B, C, D, and E in addition to the target region A. According to the formula in step S102, the second vector correlation values between the target region A and each of these regions can be calculated respectively. The correlation threshold is, for example, 0. If the values for regions C and D are both greater than 0 and the values for regions B and E are both less than 0, the topics of regions C and D are also suitable for the target region A and are added to the topic push set of the target region A; that is, the h topics in the subsets corresponding to regions C and D are labeled with the target region A, so that the topic push set of the target region A can be further expanded.
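The expansion described in this example can be sketched as follows. The second vector correlation values are supplied as inputs because their formula is not reproduced here; the function name `expand_push_set`, the numeric values, and the topic identifiers are hypothetical, while the example correlation threshold of 0 and the regions B, C, D, and E come from the text.

```python
from typing import Dict, List


def expand_push_set(
    push_set: List[str],
    second_values: Dict[str, float],
    topics_by_region: Dict[str, List[str]],
    correlation_threshold: float = 0.0,
) -> List[str]:
    """Return the push set extended with the topics of every region whose
    second vector correlation value exceeds the correlation threshold."""
    expanded = list(push_set)
    for region, value in second_values.items():
        if value > correlation_threshold:
            for topic in topics_by_region.get(region, []):
                if topic not in expanded:  # avoid duplicate topics
                    expanded.append(topic)
    return expanded


# Hypothetical values: C and D are positive, B and E are negative.
second_values = {"B": -0.2, "C": 0.4, "D": 0.1, "E": -0.6}
topics_by_region = {
    "B": ["b1"],
    "C": ["c1", "c2"],
    "D": ["d1"],
    "E": ["e1"],
}

expanded = expand_push_set(["a1"], second_values, topics_by_region)
```

With these inputs only the topics of regions C and D are appended to the push set of the target region A.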
According to the topic pushing method provided by this embodiment, a region diffusion chain of a pre-pushed topic is first obtained, the region diffusion chain comprising a diffusion level integer value and a similarity corresponding to each region in a topic source database. A vector correlation value between a target region and the region where the pre-pushed topic is located is then calculated according to the diffusion level integer value and the similarity, which realizes a regional vector topology analysis of topic contents. Next, a correlation degree value between the pre-pushed topic and the target region is calculated according to the vector correlation value and the distance between the target region and the region where the pre-pushed topic is located, which yields a dynamic applicability score for each topic in different regions. If the correlation degree value is greater than a threshold value, the pre-pushed topic is added to the topic push set of the target region for subsequent pushing. Automatic regional labeling is thereby achieved without manual labeling, which helps teachers quickly screen out topics suited to their own region.
Referring to FIG. 3, based on the same inventive concept, a topic pushing system according to a second embodiment of the present invention includes:
an acquisition module, configured to obtain a region diffusion chain of a pre-pushed topic, wherein the region diffusion chain comprises a diffusion level integer value and a similarity corresponding to each region in a topic source database;
a first calculation module, configured to calculate a vector correlation value between a target region and the region where the pre-pushed topic is located according to the diffusion level integer value and the similarity;
a second calculation module, configured to calculate a correlation degree value between the pre-pushed topic and the target region according to the vector correlation value and the distance between the target region and the region where the pre-pushed topic is located;
and a first pushing module, configured to add the pre-pushed topic to the topic push set of the target region if the correlation degree value is greater than a threshold value, and to push the topics in the topic push set of the target region when a topic push request from the target region is received.
In this embodiment, the first calculation module is configured to calculate the vector correlation value by a formula in which the pre-pushed topic T belongs to a topic set, A is the target region, X is the region where the pre-pushed topic is located, Y is the diffusion level integer value, and S is the similarity to T.
Further, the second calculation module is configured to calculate the correlation degree value R(T, A) by a formula based on the vector correlation value and the distance between region X and region A.
Further, the threshold value is 0.5.
In this embodiment, the system further includes:
a third calculation module, configured to calculate a second vector correlation value between the target region and each region other than the target region in the topic source database;
and a second pushing module, configured to add the topics of each region whose second vector correlation value is greater than a correlation threshold to the topic push set of the target region, and to push the topics in the topic push set of the target region when a topic push request from the target region is received.
According to the topic pushing system provided by this embodiment, a region diffusion chain of a pre-pushed topic is first obtained, the region diffusion chain comprising a diffusion level integer value and a similarity corresponding to each region in a topic source database. A vector correlation value between a target region and the region where the pre-pushed topic is located is then calculated according to the diffusion level integer value and the similarity, which realizes a regional vector topology analysis of topic contents. Next, a correlation degree value between the pre-pushed topic and the target region is calculated according to the vector correlation value and the distance between the target region and the region where the pre-pushed topic is located, which yields a dynamic applicability score for each topic in different regions. If the correlation degree value is greater than a threshold value, the pre-pushed topic is added to the topic push set of the target region for subsequent pushing. Automatic regional labeling is thereby achieved without manual labeling, which helps teachers quickly screen out topics suited to their own region.
Furthermore, an embodiment of the present invention also proposes a readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the method described in the first embodiment.
Furthermore, an embodiment of the present invention also provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of the method in the first embodiment when executing the program.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit of a logic gate circuit specifically used for realizing a logic function for a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.