CN110263266A - Data display method based on a WeChat mini program and a web crawler - Google Patents

Info

Publication number
CN110263266A
Authority
CN
China
Prior art keywords
data
file
module
function
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910417546.XA
Other languages
Chinese (zh)
Inventor
韩飞
方升
凌万云
凌青华
瞿刘辰
宋余庆
周从华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University
Original Assignee
Jiangsu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University
Priority to CN201910417546.XA
Publication of CN110263266A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data display method based on a WeChat mini program and a web crawler. The crawler framework Scrapy crawls data from various websites, the crawled data is stored in a MongoDB database on a cloud server, back-end code written with Python's Django framework retrieves the crawled data, and interfaces are provided through which the WeChat mini program obtains and displays the data. The invention takes full advantage of the fact that WeChat mini programs require no installation and save memory, collects data with crawler technology, and solves the problem that information could previously be browsed only after installing an app.

Description

A data display method based on a WeChat mini program and a web crawler
Technical field
The invention belongs to the field of computer application technology and specifically relates to a data display method based on a WeChat mini program and a web crawler.
Background art
Django is an open-source web application framework written in Python. It adopts the MVC architectural pattern, that is, model (M), view (V), and controller (C). Django's main purpose is to make developing database-driven websites simple and fast. It emphasizes code reuse: multiple components can conveniently serve the whole framework as "plug-ins", many powerful third-party plug-ins exist for Django, and developers can easily build their own toolkits, which makes Django highly extensible. It also emphasizes rapid development and the DRY (Don't Repeat Yourself) principle.
MongoDB is a database based on distributed document storage, written in C++, that aims to provide a scalable, high-performance data storage solution for web applications. MongoDB sits between relational and non-relational databases: among non-relational databases it is the most feature-rich and the most similar to a relational database. The data structures it supports are very loose, using BSON, a JSON-like format, so it can store relatively complex data types. MongoDB's most notable feature is its powerful query language, whose syntax somewhat resembles an object-oriented query language; it can implement almost all of the functionality of single-table queries in a relational database, and it also supports indexing the data. It is characterized by high performance, easy deployment, and ease of use, and storing data is very convenient. Its main functional feature is collection-oriented storage, which makes it easy to store object-type data.
A WeChat mini program (Mini Program for short) is an application that can be used without being downloaded or installed. It realizes the vision of applications being "within reach": users open an application simply by scanning a code or searching. Since registration was fully opened, developers whose entity type is enterprise, government, media, other organization, or individual can apply to register a mini program. Mini programs, subscription accounts, service accounts, and enterprise accounts are parallel systems. On January 9, 2017, the mini program announced by Zhang Xiaolong at the 2017 WeChat Open Class Pro officially went live. The mini program is both an application that can be used without downloading and a demanding innovation; after nearly two years of development, a new mini program development environment and developer ecosystem have been built. Mini programs are also one of the few innovations in China's IT industry in recent years that can truly affect ordinary programmers, and in many cities mini programs already support subway and bus services.
Precisely because mini programs have the above advantages, a data display method based on a WeChat mini program and a web crawler is designed. Vue.js is a progressive framework for building user interfaces. Unlike other large frameworks, Vue.js is designed to be adopted incrementally from the bottom up. Its core library focuses only on the view layer, which makes it easy to pick up and easy to integrate with third-party libraries or existing projects. On the other hand, when combined with a modern toolchain and various supporting libraries, Vue.js is fully capable of powering complex single-page applications. Someone who knows HTML, JS, and CSS can also write mini program code; Vue.js simply makes writing mini program code more convenient and easier to pick up.
Summary of the invention
In view of the characteristics of mini programs, the comprehensive display of various kinds of information, and the advantages of the Django framework, the invention proposes a data display method based on a WeChat mini program and a web crawler.
A data display method based on a WeChat mini program and a web crawler, comprising the following steps:
Step 1: First write the pages. The page code of each module is written with the WeChat Developer Tools. A project is first created in this tool; the created project comes with two folders, pages and utils, and four files: app.js, app.json, app.wxss, and project.config.json. Each folder and each file has a different function. Each folder inside the pages folder represents one module (a module folder), and each module folder contains four files: a js, a json, a wxml, and a wxss file. The js file holds the JS code that handles the page's dynamic behavior and data acquisition; the json file holds the module's configuration information; the wxml file holds the module's page structure code; and the wxss file holds the page styling code. The four files together form one module;
Step 2: After the pages of step 1 have been written, the page data must be displayed dynamically. In the PyCharm software, entering the command scrapy startproject projectName in the console creates a crawler project based on the Scrapy framework, where projectName is the project name. After changing into the projectName directory, entering the command scrapy genspider scrapyName URL in the console creates a crawler file, where scrapyName is the crawler file name and URL is the address of the website to crawl; entering scrapy crawl scrapyName runs the project. The data of the movie module, the IT news module, and the joke module is crawled from three different websites, so a separate crawler file is created for each module to crawl its data, and the crawled data is stored in the MongoDB database on the cloud server. In a crawler file, the attributes name, allowed_domains, and start_urls are, respectively, the name of the crawler, the allowed domain name, and the URL of the target website. The parse function contains the data-processing logic: it extracts the target data from the fetched page source with XPath and wraps it in an object, and the wrapped object is passed to the pipelines.py file through the yield scrapy.Request code. In the process_item function of that file, the parameter item is the wrapped object;
Next, the data is transferred to the MongoDB database on the cloud server. This database is non-relational, so the fields of each record can differ, which makes it suitable for storing crawler data. Once the data has been obtained, it must be retrieved in the WeChat mini program, which requires interfaces (APIs): calling an interface fetches data from the MongoDB database on the cloud server and returns it to the WeChat mini program, which displays the returned data on its pages.
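The crawl-and-store flow of step 2 can be sketched roughly as follows. This is a minimal, standard-library-only stand-in rather than actual Scrapy or MongoDB code: `parse` uses the limited XPath support of `xml.etree.ElementTree` in place of Scrapy selectors, `InMemoryPipeline` stands in for a class in pipelines.py that would write to MongoDB, and the sample markup, field names, and helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page; a real spider would receive the
# downloaded HTML of the target website instead.
SAMPLE_PAGE = """
<html><body>
  <div class="movie"><span class="title">Movie A</span><span class="score">9.1</span></div>
  <div class="movie"><span class="title">Movie B</span><span class="score">8.7</span></div>
</body></html>
"""


def parse(page_source):
    """Extract target data with XPath-style queries and wrap each hit in an item."""
    root = ET.fromstring(page_source)
    for node in root.findall(".//div[@class='movie']"):
        yield {
            "title": node.find("./span[@class='title']").text,
            "score": float(node.find("./span[@class='score']").text),
        }


class InMemoryPipeline:
    """Stand-in for a pipeline class; a list plays the role of the MongoDB collection."""

    def __init__(self):
        self.collection = []

    def process_item(self, item, spider=None):
        # A real pipeline would insert the item into MongoDB at this point.
        self.collection.append(dict(item))
        return item


def run_crawl(page_source, pipeline):
    """Feed every parsed item through the pipeline and return the stored count."""
    for item in parse(page_source):
        pipeline.process_item(item)
    return len(pipeline.collection)
```

Keeping extraction (`parse`) separate from storage (`process_item`) mirrors the split between the crawler file and pipelines.py described above, so the storage backend could be swapped without touching the extraction logic.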
Further, step 1 includes creating a folder named welcome in the pages folder; this is the authorized-login module. Four files, welcome.js, welcome.json, welcome.wxss, and welcome.wxml, are created in the welcome folder, each with a different function. The welcome.wxml file holds the page code of the 'authorized login' module, mainly using the tags <view>, <image>, <button>, and <text>: the <view> tag creates a page region to be filled with content, similar to the <div> tag in HTML; the <image> tag displays pictures; the <button> tag creates the 'authorized login' button; and the <text> tag displays text. The welcome.js file holds the event-handling code: when the 'authorized login' button is clicked, the bindGetUserInfo function is executed, which records the information of the logged-in user and navigates to the Text and Words module. Before the Text and Words module page is displayed, the onLoad method in the post.js file of the posts module is executed; this function loads the data records to be displayed from local data, each record containing picture information and text information. welcome.json mainly registers the module's information and sets the background color of the navigation bar. The welcome.wxss file styles the page so that it looks neatly aligned; its code is essentially the CSS used with HTML.
Further, step 1 includes creating a folder named movies, i.e., the movie module, in the pages folder; this module displays movie data. Four files are created in the movies folder: movies.js, movies.json, movies.wxss, and movies.wxml. Clicking the 'Movies' button in the bottom navigation menu opens the home page of the movie module, which shows movies in three categories: movies now showing, movies coming soon, and the Douban Top 250. The home page first shows three movies per category; clicking the 'More' button next to a category shows all movies in that category. Clicking a movie's region shows the details of that movie, including its release time, release location, director names and pictures, actor names and pictures, synopsis, and number of comments.
Further, step 1 includes creating a folder named itnews, i.e., the IT news module, in the pages folder; this module displays IT news data. Four files are created in the itnews folder: itnews.js, itnews.json, itnews.wxss, and itnews.wxml. Clicking the 'IT' button in the bottom navigation menu opens the home page of the IT news module: according to the tabBar registered in app.json, the onLoad function of itnews.js is entered, data is fetched through a URL, and the fetched data is passed to itnews.wxml for display. On the module's home page, the header uses a <scroll-view> component to scroll IT news pictures, and each row below represents one record, containing a news picture, headline, news keywords, and publication time, arranged in a left-right layout. Clicking a record triggers the toDetail function in the js file, which retrieves the details of that record, including publication time, source, title, and the full news text.
Further, step 1 includes creating a folder named joke, i.e., the joke module, in the pages folder; this module displays joke data. Four files are created in the joke folder: joke.js, joke.json, joke.wxss, and joke.wxml. Clicking the 'Jokes' button in the bottom navigation menu opens the home page of the joke module. On this page, the <scroll-view> component is used to scroll the joke data vertically. Each joke record includes the poster's picture, user name, joke text, number of laughs, and number of comments. Every joke uses the same layout, and when the scroll bar reaches the bottom, more data is loaded and displayed.
Further, step 2 includes: when the 'Movies' button in the bottom navigation menu is clicked, the movie module is first entered according to "pagePath": "pages/movies/movies" of the tabBar in app.json. The movies.js file is executed first and the onLoad function is entered; this is the initialization function, used to load data. In this function, data in the MongoDB database on the cloud server is fetched remotely through interfaces written with Python's Django framework and defined by the path entries in the urls.py file of the Django project. There are six interfaces in total: three obtain movie information, namely the movie name, movie picture, and rating, and the other three obtain movie details. Calling different interfaces executes different methods of views, which obtain the different categories of movie data and return it to the onLoad function; once onLoad has received the data, it passes the data to the movies.wxml page for display.
Further, step 2 includes: when the 'IT' button in the bottom navigation menu is clicked, the IT news module is first entered according to "pagePath": "pages/itnews/itnews" of the tabBar in app.json. The itnews.js file is executed first and the onLoad function is entered; this is the initialization function, used to load data. In this function, data in the MongoDB database on the cloud server is fetched remotely through interfaces defined by the path entries in the urls.py file of the Django project. There are two interfaces in total: one obtains IT news information, namely the headline, news picture, news keywords, and publication time, and the other obtains the details of an IT news item. Calling different interfaces executes different methods of views, which obtain the different news data and return it to the onLoad function; once onLoad has received the data, it passes the data to the itnews.wxml page for display.
Further, step 2 includes: when the 'Jokes' button in the bottom navigation menu is clicked, the joke module is first entered according to "pagePath": "pages/joke/joke" of the tabBar in app.json. The joke.js file is executed first and the onLoad function is entered; this is the initialization function, used to load data. In this function, data in the MongoDB database on the cloud server is fetched remotely through an interface defined by a path entry in the urls.py file of the Django project, namely path("get_duanzi/", views.getDuanZi), which obtains the joke data: user avatar, user name, joke text, number of laughs, and number of comments. Calling this interface executes the getDuanZi method of views, which obtains the joke data and returns it to the onLoad function; once onLoad has received the data, it passes the data to the joke.wxml page for display.
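What a view such as getDuanZi returns might look like is sketched below using only the standard library. The function name `get_duanzi_payload`, the field names, and the sample records are assumptions; the source names only the route and the kinds of data returned. An actual Django view would wrap such a payload in an HTTP response.

```python
import json

# Hypothetical records standing in for documents read from MongoDB.
SAMPLE_JOKES = [
    {"avatar": "https://example.com/a.png", "user_name": "user_a",
     "content": "First joke.", "laugh_count": 12, "comment_count": 3},
    {"avatar": "https://example.com/b.png", "user_name": "user_b",
     "content": "Second joke.", "laugh_count": 5, "comment_count": 0},
]

# The five kinds of data the joke interface is described as returning
# (key names are illustrative, not taken from the source).
JOKE_FIELDS = ("avatar", "user_name", "content", "laugh_count", "comment_count")


def get_duanzi_payload(records):
    """Serialize joke records to a JSON string, keeping only the expected fields."""
    trimmed = [{key: rec.get(key) for key in JOKE_FIELDS} for rec in records]
    # ensure_ascii=False keeps Chinese text readable in the response body.
    return json.dumps({"data": trimmed}, ensure_ascii=False)
```

On the mini program side, the onLoad function would parse this JSON and hand the list under "data" to the page for rendering.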
The idea of the invention is as follows. The pages are first built with Vue.js, including writing the CSS, JS, and HTML. There are four main modules: the Text and Words module, the movie module, the IT news module, and the joke module. The data of the Text and Words module is currently stored locally and is read from a local file for display when the mini program runs. The data of the movie module is crawled by a crawler program written with Python's Scrapy framework and stored in the MongoDB non-relational database on the cloud server; an interface (URL) is then written with Python's Django framework, and the mini program obtains the data through that URL. The data is processed into JSON format, and the mini program pages show, respectively, movies now showing, movies coming soon, and the Douban Top 250. The IT news module page presents news with a carousel and with pictures and text together; its data is likewise obtained by a crawler and then retrieved as JSON through a URL. Similarly, the data of the joke module is obtained by a crawler program and retrieved as JSON through a URL; it includes the text content, the poster's nickname and picture, the number of laughs, and the number of comments.
Based on the above idea, the technical solution of the invention consists mainly of the following two points:
(1) Writing the pages of the four modules above: the pages mainly use Vue.js technology, including writing JS, CSS, and HTML code. Each functional module consists of four main files, js, wxss, wxml, and json: JS code goes in the file with the .js suffix, CSS code in the file with the .wxss suffix, HTML code in the file with the .wxml suffix, and configuration information in the file with the .json suffix. Across different functional modules some pages may be identical, so the shared code should be extracted to reduce the amount of code written and improve efficiency.
(2) Obtaining the data for the four modules above: data is essential to an application. The innovation of the invention lies in where the data comes from, how it is stored, and how it is retrieved through the Django framework. The data is obtained by crawler programs. Crawler code is first written with the Scrapy framework; the code differs from website to website because it is written according to each site's page structure, and Scrapy offers considerable advantages in fetching page data and storing data. The data is stored in the non-relational database in JSON format, and interfaces (URLs) for retrieving the data are then written with Python's Django framework, so that the mini program can obtain JSON data through a URL. The mini program obtains 10 records at a time and obtains the next 10 through dynamic loading, until all records have been obtained. The purpose is to reduce the load on the server and to spare the mini program the burden of fetching a large amount of data at once. This combines the simplicity and speed of the Django framework, the 'within reach' nature of mini programs, and the asynchronous downloading of the Scrapy framework.
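The load-10-at-a-time behavior can be illustrated with a small sketch. It assumes zero-based page numbers and an in-memory list in place of the database; the function names `fetch_page` and `load_all` are hypothetical and do not come from the source.

```python
def fetch_page(records, page, page_size=10):
    """Return one page of records; page numbers start at 0."""
    start = page * page_size
    return records[start:start + page_size]


def load_all(records, page_size=10):
    """Simulate the client loading page after page until nothing is left."""
    loaded, page = [], 0
    while True:
        batch = fetch_page(records, page, page_size)
        if not batch:
            # An empty batch signals that every record has been obtained.
            break
        loaded.extend(batch)
        page += 1
    return loaded, page
```

With 25 stored records, for example, the client would make three non-empty requests (10, 10, and 5 records) before an empty batch tells it to stop.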
The main beneficial effects produced by combining the above two points are:
(1) the crawler program crawls data and stores it in the database, and code written with the Django framework retrieves the data;
(2) Vue.js obtains the data through the interfaces and the mini program displays it, realizing the basic functions.
Description of the drawings
Fig. 1 is a functional block diagram of the data display method based on a WeChat mini program and a web crawler according to the invention;
Fig. 2 is a screenshot of the authorized-login page;
Fig. 3(a) and Fig. 3(b) are screenshots of the Text and Words module pages;
Fig. 4(a), Fig. 4(b), and Fig. 4(c) are screenshots of the movie module pages;
Fig. 5(a) and Fig. 5(b) are screenshots of the IT news pages;
Fig. 6 is a screenshot of the joke page.
Detailed description of embodiments
Specific embodiments of the invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note in particular that, in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the main content of the invention.
1. The method of the invention is implemented through the following steps:
Step 1: First, the pages of the four modules are written. The page code of each module is written with the WeChat Developer Tools. A project is first created in this tool; the created project comes with two folders (pages and utils) and four files: app.js, app.json, app.wxss, and project.config.json. Each folder and each file has a different function. Each folder inside the pages folder represents one module (a module folder), and each module folder contains four files: a js, a json, a wxml, and a wxss file. The js file holds the JS code that handles the page's dynamic behavior and data acquisition; the json file holds the module's configuration information; the wxml file holds the module's page structure code; and the wxss file holds the page styling code. The four files together form one module, and none of them can be omitted.
Step 2: After the pages of the four modules in step 1 have been written, the page data must be displayed dynamically. In the PyCharm software, entering the command scrapy startproject projectName in the console creates a crawler project based on the Scrapy framework, where projectName is the project name. After changing into the projectName directory, entering the command scrapy genspider scrapyName URL in the console creates a crawler file, where scrapyName is the crawler file name and URL is the address of the website to crawl; entering scrapy crawl scrapyName runs the project. The data of the movie module, the IT news module, and the joke module is crawled from three different websites, so a separate crawler file is created for each module to crawl its data, and the crawled data is stored in the MongoDB database on the cloud server. In a crawler file, the attributes name, allowed_domains, and start_urls are, respectively, the name of the crawler, the allowed domain name, and the URL of the target website. The parse function contains the data-processing logic: it extracts the target data from the fetched page source with XPath and wraps it in an object, and the wrapped object is passed to the pipelines.py file through the yield scrapy.Request code. In the process_item function of that file, the parameter item is the wrapped object. Next, the data is transferred to the MongoDB database on the cloud server. This database is non-relational: the fields of each record can differ, and a record may have one field more or one field fewer than another, which gives great flexibility and suits the storage of crawler data. Once the data has been obtained, it must be retrieved in the WeChat mini program, which requires interfaces (APIs): calling an interface fetches data from the MongoDB database on the cloud server and returns it to the WeChat mini program, which displays the returned data on its pages. This is the overall process of data acquisition, data storage, and data display.
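The point that records in one collection need not share the same fields can be made concrete with a short sketch. Plain Python dictionaries stand in for MongoDB documents here, and the sample records and helper names are assumptions for illustration only.

```python
import json

# Hypothetical crawled records: the second lacks "score", the third adds "tags".
collection = [
    {"title": "Movie A", "score": 9.1},
    {"title": "Movie B"},
    {"title": "Movie C", "score": 8.0, "tags": ["drama"]},
]


def field_names(documents):
    """Collect every field name that appears in at least one document."""
    names = set()
    for doc in documents:
        names.update(doc)
    return sorted(names)


def to_json_lines(documents):
    """Serialize each document on its own line, whatever fields it has."""
    return "\n".join(json.dumps(doc, sort_keys=True) for doc in documents)
```

A crawler can therefore store whatever fields a page happens to provide, and the code that later reads the data decides how to handle a missing field.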
2. In the data display method based on a WeChat mini program and a web crawler according to the invention, step 1 above comprises the following steps:
Step 1.1: In the project created in step 1, a folder named welcome (the name of the authorized-login module) is created in the pages folder. This module handles authorized login, and the result is shown in Fig. 2. Four files, welcome.js, welcome.json, welcome.wxss, and welcome.wxml, are created in the welcome folder, each with a different function. The welcome.wxml file holds the page code of the 'authorized login' module, mainly using the tags <view>, <image>, <button>, and <text>: the <view> tag creates a page region to be filled with content, similar to the <div> tag in HTML; the <image> tag displays pictures; the <button> tag creates the 'authorized login' button; and the <text> tag displays text. The welcome.js file holds the event-handling code: when the 'authorized login' button is clicked, the bindGetUserInfo function is executed, which records the information of the logged-in user and navigates to the Text and Words module. Before the Text and Words module page is displayed, the onLoad method in the post.js file of the posts module is executed; this function loads the data records to be displayed from local data, each record containing picture information and text information. welcome.json mainly registers the module's information and sets the background color of the navigation bar. The welcome.wxss file styles the page so that it looks neatly aligned; its code is essentially the CSS used with HTML.
Step 1.2: In the project created in step 1, a folder named movies (the name of the movie module) is created in the pages folder; this module displays movie data. Four files are created in the movies folder: movies.js, movies.json, movies.wxss, and movies.wxml; the functions of these four files were described in step 1. Clicking the 'Movies' button in the bottom navigation menu opens the home page of the movie module, which shows movies in three categories, namely movies now showing, movies coming soon, and the Douban Top 250, as shown in Fig. 4(a); each movie is shown with its cover picture, name, and rating. The home page first shows three movies per category; clicking the 'More' button next to a category shows all movies in that category, as shown in Fig. 4(b). Clicking a movie's region shows the details of that movie, including its release time, release location, director names and pictures, actor names and pictures, synopsis, and number of comments, as shown in Fig. 4(c).
Step 1.3: In the project created in step 1, a folder named itnews (the IT news module) is created in the pages folder; this module displays IT news data. Four files are created in the itnews folder: itnews.js, itnews.json, itnews.wxss, and itnews.wxml; the functions of these four files were described in step 1. Clicking the 'IT' button in the bottom navigation menu opens the home page of the IT news module: according to the tabBar registered in app.json, the onLoad function of itnews.js is entered, data is fetched through a URL, and the fetched data is passed to itnews.wxml for display, as shown in Fig. 5(a). On the module's home page, the header uses a <scroll-view> component to scroll IT news pictures, and each row below represents one record, containing a news picture, headline, news keywords, and publication time, arranged in a left-right layout. Clicking a record triggers the toDetail function in the js file, which retrieves the details of that record, as shown in Fig. 5(b); the details include publication time, source, title, and the full news text.
Step 1.4: In the project created in step 1, a folder named joke (the joke module) is created in the pages folder; this module displays joke data. Four files are created in the joke folder: joke.js, joke.json, joke.wxss, and joke.wxml; the functions of these four files were described in step 1. Clicking the 'Jokes' button in the bottom navigation menu opens the home page of the joke module, as shown in Fig. 6. On this page, the <scroll-view> component is used to scroll the joke data vertically. Each joke record includes the poster's picture, user name, joke text, number of laughs, and number of comments. Every joke uses the same layout, and when the scroll bar reaches the bottom, more data is loaded and displayed.
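The project layout described in steps 1.1 to 1.4 can be summarized with a small scaffolding sketch. It only creates empty placeholder files to show the folder structure; it is not part of the WeChat Developer Tools, and the helper names `scaffold` and `demo` are hypothetical. Only the four module folders explicitly created in the steps above are included.

```python
import tempfile
from pathlib import Path

# Module folders and file suffixes named in steps 1.1 to 1.4.
MODULES = ("welcome", "movies", "itnews", "joke")
SUFFIXES = ("js", "json", "wxml", "wxss")
ROOT_FILES = ("app.js", "app.json", "app.wxss", "project.config.json")


def scaffold(root):
    """Create empty placeholder files mirroring the described project layout."""
    root = Path(root)
    (root / "utils").mkdir(parents=True, exist_ok=True)
    for name in ROOT_FILES:
        (root / name).touch()
    for module in MODULES:
        folder = root / "pages" / module
        folder.mkdir(parents=True, exist_ok=True)
        for suffix in SUFFIXES:
            (folder / f"{module}.{suffix}").touch()
    # Return the created files as sorted, root-relative POSIX paths.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


def demo():
    """Scaffold into a throwaway directory and return the created file list."""
    with tempfile.TemporaryDirectory() as tmp:
        return scaffold(tmp)
```

Calling `demo()` lists the four project-level files plus four files for each of the four module folders, for example pages/movies/movies.wxml.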
3. In the data display method based on a WeChat mini program and a web crawler according to the invention, step 2 above comprises the following steps:
Step 2.1: In the Text and Words module, the data and picture information are loaded from local data rather than obtained through an interface (API). The purpose is fast access, so that data can be returned to the page quickly.
Step 2.2: When the 'Movies' button in the bottom navigation menu is clicked, the movie module is first entered according to "pagePath": "pages/movies/movies" of the tabBar in app.json. The movies.js file is executed first and the onLoad function is entered; this is the initialization function, used to load data. In this function, data in the MongoDB database on the cloud server is fetched remotely through interfaces written with Python's Django framework and defined by the path entries in the urls.py file of the Django project, namely path("get_movieTop250/", views.getMovieTop250), path("get_movieTop250_detail/", views.getMovieTop250Detail), path("get_movie_now/", views.getMovieNow), path("get_movieNnow_detail/", views.getMovieNowDetail), path("get_movie_coming/", views.getMovieComing), and path("get_movieComing_detail/", views.getMovieComingDetail), six interfaces in total. Of these, the three interfaces path("get_movieTop250/", views.getMovieTop250), path("get_movie_now/", views.getMovieNow), and path("get_movie_coming/", views.getMovieComing) obtain movie information (movie name, movie picture, and rating), and the other three obtain movie details. Calling different interfaces executes different methods of views, which obtain the different categories of movie data and return it to the onLoad function; once onLoad has received the data, it passes the data to the movies.wxml page for display.
Step 2.3: When the 'IT' button in the bottom navigation menu is clicked, the IT news module is first entered according to "pagePath": "pages/itnews/itnews" of the tabBar in app.json. The itnews.js file is executed first and the onLoad function is entered; this is the initialization function, used to load data. In this function, data in the MongoDB database on the cloud server is fetched remotely through interfaces defined by the path entries in the urls.py file of the Django project, namely path("get_sinanews/", views.getSinaNews) and path("get_sinanews_detail/", views.getSinaNewsDetail), two interfaces in total. The path("get_sinanews/", views.getSinaNews) interface obtains IT news information (headline, news picture, news keywords, and publication time), and the path("get_sinanews_detail/", views.getSinaNewsDetail) interface obtains the details of an IT news item. Calling different interfaces executes different methods of views, which obtain the different news data and return it to the onLoad function; once onLoad has received the data, it passes the data to the itnews.wxml page for display.
Step 2.4: When the 'joke' (cross-talk) button in the bottom navigation menu is clicked, the mini program first enters the joke module according to the tabBar entry "pagePath": "pages/joke/joke" in app.json. The joke.js file is executed first and its onLoad function is entered; onLoad is the initialization function and is used to load data. Inside this function, the data stored in the mongoDB database on the cloud server is fetched remotely through the interface registered as a path entry in the urls.py file of the Django framework, namely path("get_duanzi/", views.getDuanZi), which returns joke data (user avatar, user name, joke content, number of laughs and number of comments). Calling this interface executes the getDuanZi method of views, which obtains the joke data and returns it to the onLoad function; after onLoad receives the data, it passes the data to the joke.wxml page for display.
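The routing described in steps 2.2 to 2.4 can be illustrated without Django. The following is a minimal, framework-free sketch in Python: a plain dictionary plays the role of urls.py, and the handler functions, field names and sample records are hypothetical placeholders rather than the views code of the invention. In an actual Django project the same mapping would be expressed with the path(...) entries quoted in the steps.

```python
import json

# Hypothetical in-memory records standing in for the MongoDB collections.
MOVIES_TOP250 = [
    {"id": 1, "name": "Film A", "picture": "a.jpg", "score": 9.1,
     "director": "Director A", "summary": "Plot of film A"},
]
JOKES = [
    {"avatar": "u1.png", "user": "user1", "content": "a joke",
     "laughs": 12, "comments": 3},
]

# Fields returned by a list interface (assumed; the detail interface returns all).
LIST_FIELDS = ("id", "name", "picture", "score")


def get_movie_top250(params):
    """List interface: return only the summary fields of each film."""
    return [{k: m[k] for k in LIST_FIELDS} for m in MOVIES_TOP250]


def get_movie_top250_detail(params):
    """Detail interface: return the full record of one film, or None."""
    wanted = int(params.get("id", 0))
    return next((m for m in MOVIES_TOP250 if m["id"] == wanted), None)


def get_duanzi(params):
    """Joke interface: return every joke record."""
    return JOKES


# Plays the role of urls.py: URL path -> view function.
ROUTES = {
    "get_movieTop250/": get_movie_top250,
    "get_movieTop250_detail/": get_movie_top250_detail,
    "get_duanzi/": get_duanzi,
}


def dispatch(path, params=None):
    """Look up the view for a path and return its result as a JSON string."""
    view = ROUTES.get(path)
    if view is None:
        return json.dumps({"error": "not found"})
    return json.dumps({"data": view(params or {})})
```

Calling dispatch("get_movieTop250/") corresponds to the request issued from onLoad; the JSON string it returns is what the page would receive and bind for display.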
4. As shown in Fig. 1, the four modules of the data display method based on a WeChat mini program and a crawler of the present invention are implemented in the following steps:
S1: Authorized login
On entering the mini program, the authorization screen appears first, as shown in Fig. 2. If the user clicks the 'authorized login' button, the mini program obtains the user's nickname, avatar and similar information and enters its main interface, that is, the 'Text and Words' module page, as shown in Fig. 3(a); to leave instead, the user can click the round button in the upper right corner to exit the mini program.
S2: 'Text and Words' module
After the authorized login, the 'Text and Words' module page is entered first. This page consists of a carousel and the posts published by users, and its data is read from a local file. The carousel uses the swiper component, in which each picture is a swiper-item, and the time between picture switches is set through interval, in milliseconds. Below the carousel are the posts published by users, each including the user's avatar, publication time, title, detailed content, number of likes and number of views. Clicking a post opens its details, where the post can be bookmarked and shared; clicking the small music icon on the picture plays music, as shown in Fig. 3(b).
S3: Viewing film information
Clicking 'film' in the bottom navigation menu brings up the page shown in Fig. 4(a). The page consists of four parts: a search box, upcoming films, films now showing and Douban Top250 films; the data is obtained from the server through a url. The first three films of each of the three categories are shown initially, with the film poster, film name and score. Clicking a film shows its details, including the genre, release date, director, cast and synopsis, as shown in Fig. 4(b). Entering a film name in the search box performs a fuzzy search. Each category has a 'more' button; clicking it shows 10 films, and pulling up loads 10 more each time until all have been loaded, as shown in Fig. 4(c).
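The home-page preview and the search box can be sketched as two small Python helpers. The text does not say how the fuzzy search matches names, so a case-insensitive substring match is assumed here; the function names and the shape of the film records are likewise hypothetical.

```python
def fuzzy_search(films, keyword):
    """Return the films whose name contains the keyword, ignoring case.

    Substring matching is an assumption; the source only says 'fuzzy search'.
    """
    needle = keyword.strip().lower()
    if not needle:
        return []
    return [f for f in films if needle in f["name"].lower()]


def home_preview(categories, count=3):
    """Return the first `count` films of every category for the home page."""
    return {name: films[:count] for name, films in categories.items()}


if __name__ == "__main__":
    # Placeholder records, not real crawled data.
    now_showing = [{"name": "Alpha One"}, {"name": "Beta"},
                   {"name": "alpha two"}, {"name": "Gamma"}]
    print(fuzzy_search(now_showing, "ALPHA"))
    print(home_preview({"now_showing": now_showing}))
```

Keeping the preview separate from the search means the 'more' view can reuse the full category list while the home page only ever receives three entries per category.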
S4: Viewing IT news
Clicking 'IT' in the bottom navigation menu brings up the page shown in Fig. 5(a). The page consists of a carousel and IT news items. The carousel rotates through three pictures; clicking a picture opens the details of that IT news item, including the picture, publication time, source, title and detailed content, as shown in Fig. 5(b). The data is obtained from the server through an interface; it is stored in the non-relational database MongoDB on the server and is returned in json format. Below the carousel, each IT news item consists of a picture, title, keywords and publication time. Pulling up loads 10 more news items each time until all have been loaded; pulling down at the top refreshes the page and shows the 10 newest IT news items.
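The pull-up and pull-down behaviour amounts to a small piece of list state on the client. The sketch below models it in Python for illustration only (the mini program itself would implement this in its js file); the class name, the fetch_page callback and its (offset, limit) signature are assumptions, not part of the source.

```python
class NewsFeed:
    """Client-side list state for pull-up loading and pull-down refresh."""

    def __init__(self, fetch_page, page_size=10):
        # fetch_page(offset, limit) -> list of items; a hypothetical callback
        # that would wrap the request to the server interface.
        self.fetch_page = fetch_page
        self.page_size = page_size
        self.items = []
        self.finished = False

    def load_more(self):
        """Pull-up: append the next batch unless everything is loaded."""
        if self.finished:
            return 0
        batch = self.fetch_page(len(self.items), self.page_size)
        self.items.extend(batch)
        # A short batch means the server has no further records.
        if len(batch) < self.page_size:
            self.finished = True
        return len(batch)

    def refresh(self):
        """Pull-down: discard the current list and reload the newest batch."""
        self.items = []
        self.finished = False
        return self.load_more()
```

With a source of 25 items, three calls to load_more return 10, 10 and 5 items, after which further calls return 0 until refresh is called.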
S5: Viewing joke (cross-talk) information
Clicking 'joke' in the bottom navigation menu brings up the page shown in Fig. 6. The data on this page is collected by a crawler program and stored in the MongoDB non-relational database on the server; an interface written with the Python django framework makes it available, and the retrieved data is processed and displayed on the page. Each joke record consists of the publisher's avatar, nickname, text content, number of laughs and number of comments. Ten records are shown at first, and scrolling down requests 10 more each time until all data has been loaded; the purpose is to reduce the load on the server and improve the loading speed of the mini program.
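On the server side, returning 10 records per request comes down to slicing the stored records by page. A minimal sketch, assuming a 1-based page number and a has_more flag in the response (both hypothetical; the source does not describe the request parameters), with a Python list standing in for the MongoDB collection:

```python
def paginate(records, page, page_size=10):
    """Return one page of records plus a flag telling whether more remain."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    chunk = records[start:start + page_size]
    return {"items": chunk, "has_more": start + page_size < len(records)}


if __name__ == "__main__":
    jokes = [{"n": i} for i in range(23)]  # placeholder records
    first = paginate(jokes, 1)
    print(len(first["items"]), first["has_more"])
```

Against a real MongoDB collection the same slice would typically be expressed with the cursor's skip and limit methods, so that only the requested records leave the database.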
In summary, the data display method based on a WeChat mini program and a crawler of the present invention crawls data from each website with the crawler framework Scrapy, stores the crawled data in the MongoDB database on the cloud server, uses back-end code written with the Python Django framework to retrieve the crawled data, and provides interfaces through which the WeChat mini program obtains the data for display. The invention makes full use of the advantages of WeChat mini programs, which need no installation and save memory, and uses crawler technology to collect data, solving the earlier problem that information could be browsed only after installing an App.
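The crawl-and-store half of this flow can be sketched with the Python standard library alone. This is not Scrapy code: xml.etree.ElementTree supports only a subset of XPath and requires well-formed markup, the sample page and its class names are invented for illustration, and a list stands in for the MongoDB collection. In Scrapy itself, parse would be a spider method, and the items it yields are handed to process_item of the enabled pipelines.

```python
import xml.etree.ElementTree as ET

# A well-formed sample page (hypothetical markup); a real crawler would
# download the page from the target website instead.
SAMPLE_PAGE = """
<html><body>
  <div class="movie"><span class="name">Film A</span><span class="score">9.1</span></div>
  <div class="movie"><span class="name">Film B</span><span class="score">8.7</span></div>
</body></html>
"""


def parse(page_source):
    """Extract the target data with XPath-style queries, one item per film."""
    root = ET.fromstring(page_source)
    for node in root.findall(".//div[@class='movie']"):
        yield {
            "name": node.findtext("span[@class='name']"),
            "score": float(node.findtext("span[@class='score']")),
        }


class ListPipeline:
    """Stands in for pipelines.py; a list replaces the MongoDB collection."""

    def __init__(self):
        self.collection = []

    def process_item(self, item, spider=None):
        # Store a copy of the item, then pass it on unchanged.
        self.collection.append(dict(item))
        return item


def run_crawl(page_source, pipeline):
    """Feed every parsed item through the pipeline and return the stored count."""
    for item in parse(page_source):
        pipeline.process_item(item)
    return len(pipeline.collection)
```

Running run_crawl(SAMPLE_PAGE, ListPipeline()) stores two records; the interfaces described earlier would then read from that store and return the records to the mini program.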

Claims (8)

1. A data display method based on a WeChat mini program and a crawler, characterized by comprising the following steps:
Step 1: First write the pages. The page code of each module is written with the WeChat developer tool. A project is first created in this tool; the created project comes with two folders, pages and utils, and four files, namely app.js, app.json, app.wxss and project.config.json. Each folder and each file has a different function. Each folder inside the pages folder represents one module, i.e. a module folder. Each module folder contains four files, namely js, json, wxml and wxss files: the js file holds the js code that handles the page's dynamic effects and data acquisition, the json file holds the configuration information of the module, the wxml file is the page structure code of the module, and the wxss file is the page rendering code; the four files together form one module;
Step 2: After the pages of step 1 have been written, the page data still needs to be displayed dynamically. In the software PyCharm, a crawler project of the Scrapy framework is created by entering the command scrapy startproject projectName in the console, where projectName is the project name; after entering the projectName directory, the command scrapy genspider scrapyName URL is entered in the console to create a crawler file, where scrapyName is the crawler file name and URL is the address of the website to crawl; scrapy crawl scrapyName is entered to run the project. The data of the movie module, the IT news module and the joke (cross-talk) module is crawled from three different websites, and a separate crawler file is created for each module to crawl its data; the crawled data is stored in the mongoDB database on the cloud server. In a crawler file, the attributes name, allowed_domains and start_urls are the name of the crawler, the allowed domain names and the URLs of the target website, respectively. The function parse contains the data-processing logic: it extracts the target data from the fetched page code through xpath and encapsulates it into an object; the encapsulated object is passed to the pipelines.py file through the yield scrapy.Request code, and in the process_item function of that file the parameter item is the encapsulated object;
Next, the data is transferred to the mongoDB database on the cloud server. This database is a non-relational database in which each record may have different fields, which makes it suitable for storing crawler data. Once the data has been obtained, the WeChat mini program needs to retrieve it, which requires an interface (API): by calling the interface, the data in the mongoDB database on the cloud server is fetched and returned to the WeChat mini program, and the WeChat mini program displays the returned data on its pages.
2. The data display method based on a WeChat mini program and a crawler according to claim 1, characterized in that step 1 includes creating a folder named welcome in the pages folder, which is the authorized login module, and creating four files in the welcome folder, welcome.js, welcome.json, welcome.wxss and welcome.wxml, each with a different function. The page code of the 'authorized login' module is written in the welcome.wxml file, mainly using the tags <view>, <image>, <button> and <text>: the <view> tag creates a page region to be filled with content, similar to the <div> tag in Html; the <image> tag displays pictures; the <button> tag creates the 'authorized login' button; and the <text> tag displays text. The welcome.js file holds the code for triggered events: when the 'authorized login' button is clicked, the bindGetUserInfo function is executed, which records the information of the logged-in user and jumps to the 'Text and Words' module. Before the 'Text and Words' module page is displayed, the onLoad method in the post.js file of the posts module is executed; this function loads the data records to be displayed from local storage, each record comprising picture information and text information. The welcome.json file mainly registers the module's information and sets the background color of the navigation bar. The welcome.wxss file renders the page so that it looks neatly aligned; the code in it is the same as css code in Html.
3. The data display method based on a WeChat mini program and a crawler according to claim 1, characterized in that step 1 includes creating a folder named movies in the pages folder, i.e. the movie module, which displays film data; four files are created in the movies folder, namely movies.js, movies.json, movies.wxss and movies.wxml. Clicking the 'film' button in the bottom navigation menu enters the home page of the movie module, which shows films in three categories: films now showing, upcoming films and Douban Top250 films. The home page first shows three films of each category; clicking the 'more' button after a category shows all films of that category. Clicking a film's region shows the details of that film, including the release date, release location, director names and pictures, actor names and pictures, synopsis and number of comments.
4. The data display method based on a WeChat mini program and a crawler according to claim 1, characterized in that step 1 includes creating a folder named itnews in the pages folder, i.e. the IT news module, which displays IT news data; four files are created in the itnews folder, namely itnews.js, itnews.json, itnews.wxss and itnews.wxml. Clicking the 'IT' button in the bottom navigation menu enters the home page of the IT news module: according to the tabBar registered in app.json, the onLoad function of itnews.js is entered, data is fetched through a url, and the fetched data is passed to itnews.wxml for display. On the home page of the module, a <scroll-view> component at the top scrolls through IT news pictures, and each row below represents one record; each record includes a news picture, news title, news keywords and publication time, arranged in a left-right layout. Clicking a record triggers the toDetail function in the js file, which obtains the details of that record, including the publication time, source, title and detailed news description.
5. The data display method based on a WeChat mini program and a crawler according to claim 1, characterized in that step 1 includes creating a folder named joke in the pages folder, which is the joke (cross-talk) module and displays joke data; four files are created in the joke folder, namely joke.js, joke.json, joke.wxss and joke.wxml. Clicking the 'joke' button in the bottom navigation menu enters the home page of the joke module. On the home page, the <scroll-view> component is used to scroll vertically through the joke data; each joke record includes the publisher's picture, user name, joke text, number of laughs and number of comments, and every joke uses the same layout. When the scroll bar reaches the bottom, more data is loaded and displayed.
6. The data display method based on a WeChat mini program and a crawler according to claim 1, characterized in that step 2 includes: when the 'film' button in the bottom navigation menu is clicked, the mini program first enters the movie module according to the tabBar entry "pagePath": "pages/movies/movies" in app.json; the movies.js file is executed first and its onLoad function is entered, onLoad being the initialization function used to load data; inside this function, the data in the mongoDB database on the cloud server is fetched remotely through an interface, the interface being written with the Python Django framework and registered through path entries in the urls.py file of the Django framework, six interfaces in total, of which three are used to obtain film information, i.e. film name, film picture and score, and the other three are used to obtain film details; calling different interfaces executes different methods of views, which obtain the different categories of film data and return them to the onLoad function; after onLoad receives the data, it passes the data to the movies.wxml page for display.
7. The data display method based on a WeChat mini program and a crawler according to claim 1, characterized in that step 2 includes: when the 'IT' button in the bottom navigation menu is clicked, the mini program first enters the IT news module according to the tabBar entry "pagePath": "pages/itnews/itnews" in app.json; the itnews.js file is executed first and its onLoad function is entered, onLoad being the initialization function used to load data; inside this function, the data in the mongoDB database on the cloud server is fetched remotely through interfaces registered as path entries in the urls.py file of the Django framework, two interfaces in total, of which one obtains IT news information, i.e. news title, news picture, news keywords and publication time, and the other obtains the details of an IT news item; calling different interfaces executes different methods of views, which obtain the corresponding news data and return it to the onLoad function; after onLoad receives the data, it passes the data to the itnews.wxml page for display.
8. The data display method based on a WeChat mini program and a crawler according to claim 1, characterized in that step 2 includes: when the 'joke' (cross-talk) button in the bottom navigation menu is clicked, the mini program first enters the joke module according to the tabBar entry "pagePath": "pages/joke/joke" in app.json; the joke.js file is executed first and its onLoad function is entered, onLoad being the initialization function used to load data; inside this function, the data in the mongoDB database on the cloud server is fetched remotely through the interface registered as a path entry in the urls.py file of the Django framework, namely path("get_duanzi/", views.getDuanZi), which obtains joke data: user avatar, user name, joke content, number of laughs and number of comments; calling this interface executes the getDuanZi method of views, which obtains the joke data and returns it to the onLoad function; after onLoad receives the data, it passes the data to the joke.wxml page for display.
CN201910417546.XA 2019-05-20 2019-05-20 A kind of method for exhibiting data based on wechat small routine and crawler Pending CN110263266A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910417546.XA CN110263266A (en) 2019-05-20 2019-05-20 A kind of method for exhibiting data based on wechat small routine and crawler

Publications (1)

Publication Number Publication Date
CN110263266A true CN110263266A (en) 2019-09-20

Family

ID=67914823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910417546.XA Pending CN110263266A (en) 2019-05-20 2019-05-20 A kind of method for exhibiting data based on wechat small routine and crawler

Country Status (1)

Country Link
CN (1) CN110263266A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106649810A (en) * 2016-12-29 2017-05-10 山东舜网传媒股份有限公司 Ajax-based news webpage dynamic data grabbing method and system
CN106874487A (en) * 2017-02-21 2017-06-20 国信优易数据有限公司 A kind of distributed reptile management system and its method
CN109388735A (en) * 2018-09-13 2019-02-26 广州丰石科技有限公司 A method of crawling wechat public platform information

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BOBBYKEY: "在pycharm中调试运行scrapy", 《HTTPS://BLOG.CSDN.NET/BOBBYKEY/ARTICLE/DETAILS/79439326》 *
QQ_33272254: "微信小程序学习(二)项目目录结构", 《HTTPS://BLOG.CSDN.NET/QQ_33272254/ARTICLE/DETAILS/84196273》 *
唐明森: "高校网络舆情监测系统微信平台研发", 《中国优秀硕士学位论文全文数据库 (信息科技辑)》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765399A (en) * 2019-10-17 2020-02-07 苏州市那美网络科技有限公司 Intelligent page configuration system based on retail official network and implementation method thereof
CN110866059A (en) * 2019-11-15 2020-03-06 南京师范大学泰州学院 WeChat applet-based competition field scoring system
CN111414525A (en) * 2020-03-25 2020-07-14 深圳市腾讯网域计算机网络有限公司 Data acquisition method and device for small program, computer equipment and storage medium
CN111414525B (en) * 2020-03-25 2024-01-02 深圳市腾讯网域计算机网络有限公司 Method, device, computer equipment and storage medium for acquiring data of applet
CN112199137A (en) * 2020-09-01 2021-01-08 北京达佳互联信息技术有限公司 Display method and device of login interface, electronic equipment and storage medium
CN112256833A (en) * 2020-10-23 2021-01-22 清华大学深圳国际研究生院 Intelligent question answering method for mobile phone based on big data and AI algorithm
CN112256833B (en) * 2020-10-23 2024-02-27 清华大学深圳国际研究生院 Mobile phone problem intelligent question answering method based on big data and AI algorithm
CN113035300A (en) * 2021-04-20 2021-06-25 西安交通大学口腔医院 Children's caries risk assessment system
CN113836450A (en) * 2021-11-30 2021-12-24 垒知科技集团四川有限公司 Data interface generation method for acquiring XPATH based on visual operation
CN116112695A (en) * 2022-12-28 2023-05-12 安胜(天津)飞行模拟系统有限公司 Distributed flight simulation training playback system and method

Similar Documents

Publication Publication Date Title
CN110263266A (en) A kind of method for exhibiting data based on wechat small routine and crawler
Rogers Digital methods
AU2008307247B2 (en) System and method of inclusion of interactive elements on a search results page
CN102968495B (en) The vertical search engine of search contrast association shopping information and method
US8898150B1 (en) Collecting image search event information
CN103902533B (en) It is a kind of to search for through method and apparatus
US20140115439A1 (en) Methods and systems for annotating web pages and managing annotations and annotated web pages
Hanjalic et al. The holy grail of multimedia information retrieval: So close or yet so far away?
CN109791680A (en) Key frame of video on online social networks is shown
CN115087984A (en) Method, computer-readable medium, and system for creating, organizing, viewing, and connecting annotations
CN100462969C (en) Method for providing and inquiry information for public by interconnection network
CN105706081B (en) Structured message link annotation
US20100131495A1 (en) Lightning search aggregate
CN106874502A (en) A kind of method of video search, device and terminal
US8290944B2 (en) Method for storing bookmarks for search results from previously submitted search queries by a user and storing links to selected documents by the user
Khan et al. A relational aggregated disjoint multimedia search results approach using semantics
Perugini Supporting multiple paths to objects in information hierarchies: Faceted classification, faceted search, and symbolic links
Menard et al. Digital image access: an exploration of the best practices of online resources
Qiu et al. Evaluating access mechanisms for multimodal representations of lifelogs
Viljanen et al. Publishing and using ontologies as mashup services
Wanjari et al. Automatic news extraction system for Indian online news papers
Bertini et al. Interactive multi-user video retrieval systems
US9135313B2 (en) Providing a search display environment on an online resource
Yang et al. Content-based retrieval of Flash™ movies: research issues, generic framework, and future directions
He et al. Towards building a metaquerier: Extracting and matching web query interfaces

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination