Detailed Description
In the prior art, when handling case-auditing tasks for data anomalies such as credit-swiping, telecom fraud, and anti-money laundering, information is traditionally retained in one of two ways.
The first information retention mode: the operator captures the relevant data on each related website using screenshot software, then jumps to the website of the case task and manually uploads the captured images through its uploading interface. The defect of this information retention mode is the cross-domain restriction problem: when a web page under one domain name requests a resource under another domain name, the browser cannot directly access that resource if any one of the domain name, the port, or the protocol differs. Specifically, because the screenshot action and the task system are implemented by different pages or applications (whose domain names, ports, and protocols usually differ), the two cannot be linked with each other at all. After taking a screenshot of an information page, the operator must manually return to the original task page before the screenshot can be uploaded through that page's uploading interface. In addition, the screenshot type must be selected manually for every screenshot. Moreover, the currently supported selective screenshot mechanism offers only a single processing mode, so a large web page can be captured completely only by splicing multiple screenshots. The first, manual information retention mode is therefore time-consuming and laborious, and it easily interrupts the case task handling process.
For example, when an anti-money laundering organization audits an anti-money laundering task, judging the task requires a large amount of evidence, such as enterprise investigation results (from an enterprise information inquiry tool), user pictures, relationship networks, and user transactions. Because such evidence is not well structured, the best way to retain it is by screenshot. However, the evidence is usually scattered across different websites, and after every screenshot the operator must return to the original task processing page to retain and upload the image, which slows down task auditing and easily interrupts the auditor's train of thought.
The second information retention mode: screenshots are taken and retained automatically according to the information. Its disadvantage is that the sources of information relating to data anomalies are scattered and the formats of the information from the various sources usually differ, so it is difficult to format the information uniformly. Moreover, this approach does not overcome the cross-domain restriction problem either. When the page where the information is located provides no corresponding uploading interface or prohibits crawler access, the cross-domain uploading operation after the automatic screenshot cannot be carried out, which greatly limits the scenarios to which this mode can be adapted.
The conventional information retention methods are therefore cumbersome and inefficient. Accordingly, a solution is needed that reduces unnecessary information retention steps and effectively increases the speed at which the auditor retains the information of the case under audit.
FIG. 1 shows an operating platform environment in accordance with one embodiment of the present disclosure. The operating platform environment mainly includes a client 110, various websites 120(1), …, 120(n), and a task auditing system 130, which are connected with each other through a network 140. The client 110 may be a computing device, such as a personal computer, a mobile device (e.g., a mobile phone, a personal digital assistant, or a tablet), or another device with data retrieval capabilities. A website 120 may be, for example, a server, an e-commerce platform, a payment platform, a banking system, a government agency, or an enterprise agency that stores information related to data anomalies. The task auditing system 130 may be a site, such as a server, workstation, database system, or data center, capable of analyzing and processing the collected anomaly data. The network 140 may include various types of wired and wireless networks including, but not limited to, the Internet, local area networks, WIFI, WLAN, cellular communication networks (GPRS, CDMA, 2G/3G/4G/5G cellular networks), satellite communication networks, and the like. Data communication between the client 110, the various websites 120, and the task auditing system 130 is provided via the network 140. In this operating platform environment, the client 110 is operated by an operator to collect, from the various websites 120, data relating to data anomalies, such as a large number of transactions, transfers, withdrawals, and deposits. The task auditing system 130 provides a corresponding audit task page through which the collected information is uploaded to the database of the task auditing system 130 for analysis and processing, so as to carry out the case audit.
The specific structures of the client 110 and the task auditing system 130 will now be described with reference to FIG. 2. FIG. 2 shows a modular block diagram of the client 110 and the task auditing system 130 in the operating platform environment, according to one embodiment of the present disclosure. First, the client 110 includes a browser 112, which may be, for example, Google Chrome. The browser allows a user to access web pages on various websites to retrieve data anomaly information. The browser may also include a retention module 114. The retention module 114 provides a screenshot mode selection module and a cross-domain access module. Specifically, the screenshot mode selection module in the retention module 114 may, according to the user's selection, apply the selected screenshot mode (e.g., one of a visible page screenshot, a full web page screenshot, a selective screenshot, and a combined screenshot) to the displayed page of the currently visited website 120 to obtain the required screenshot. The cross-domain access module in the retention module 114 may use cross-domain technology to help the browser obtain the audit task page and task information from the task auditing system 130 located in another domain without jumping to that domain and, after automatically filling in the upload form in the page, automatically upload the retained screenshot together with the form to the task auditing system 130 to carry out the case auditing task.
It should be understood that Google Chrome is just one example of the browser, and that other browsers may be used to access information, take screenshots, and perform cross-domain uploads, as long as the corresponding retention module 114 is installed for them. It should also be understood that, in addition to being provided as a browser plug-in, the retention module 114 may take the form of a script, a separate application, an APP, an applet, or other forms of code. Fast retention of data anomalies can be achieved by installing and running such program code directly, not necessarily with the help of a browser. Accordingly, these forms of code are also within the scope of the present disclosure.
The task auditing system 130 includes a task information interface 132, through which audit task pages and task information can be provided to the client 110 so that the retention module 114 can automatically fill in the forms in the task pages. The task auditing system 130 also includes a task attachment upload interface 134, which can receive and store a retained picture (screenshot) uploaded by the retention module 114 of the client 110.
The operation of the retention module 114, the task information interface 132, and the task attachment upload interface 134 is described in greater detail below in the example process described in conjunction with FIG. 3.
With the basic operating environment and structure of the present disclosure in mind, an example flow for fast retention of data anomalies in accordance with one embodiment of the present disclosure is depicted in FIG. 3.
In describing the flow, for ease of understanding, the auditing task of an anti-money laundering case is again used as the example data anomaly handling scenario. An anti-money laundering case is chosen as the example for the following reasons. First, money laundering uses a wide range of instruments, and almost every service offered by banks may be exploited by money launderers. Second, numerous domestic and foreign commercial companies and financial institutions may be involved in the entire process of a money laundering scheme, from the source of the illicit funds to their final destination. In addition, with the development of internet finance, the various online payment and transfer channels are convenient and fast, and give money laundering activities an even richer set of means. As a result, every audit of an anti-money laundering case requires visiting a large number of websites related to the money flow, and capturing and uploading a large number of screenshots of the relevant money flow data to be retained as evidence of the crime. Specifically, when auditing an anti-money laundering case, an operator is required to collect and retain, from various websites, data on a large number of transactions, transfers, withdrawals, deposits, and the like that involve money laundering. This evidence collection and retention process may be referred to as an "information retention" operation. Evidence collection for one case may typically involve tens, hundreds, or even thousands of information retention operations, each of which may involve a different website and a different data format. Therefore, the best way to retain such unstructured evidence is by screenshot.
In the conventional technology, the operator has to capture the evidence on each website manually, switch to the audit task page again and again, and upload the evidence to the task auditing system 130 again and again. The retention of information on each web page therefore requires a series of manual operations: open the website, browse the information, take the selected screenshot, switch to the audit task page, fill in the upload form, and upload the screenshot. For the audit of an anti-money laundering case involving hundreds of website transactions or more, such a workload is clearly impractical: it not only takes a great deal of effort from the staff but is also extremely inefficient. The need for fast and efficient retention of data anomalies is therefore particularly acute in the field of anti-money laundering.
Those skilled in the art will readily appreciate that the disclosed solution is not limited to the auditing of anti-money laundering cases, but may be applied in a wide variety of scenarios where evidence needs to be collected from many websites, such as review, telecom fraud, patent infringement suits, and the like. In these similar scenarios, evidence of various data anomalies needs to be collected and retained by browsing a large number of websites and taking screenshots, and the present disclosure can therefore be applied to various data anomaly handling scenarios.
In order to improve auditing efficiency while reducing human labor, the present disclosure provides a fast retention mechanism. The fast retention mechanism enables fast retention of data anomalies by using the retention module 114 in the browser as described with reference to FIG. 2.
Specifically, as shown in FIG. 3, at step 310, the user begins browsing a web page that may be involved in money laundering activities. When relevant information such as an abnormal fund flow is found, a retention operation needs to be performed on the information of the web page. The user may activate the retention operation by clicking the retention module icon in the browser's menu bar. The retention module 114 may be pre-installed in the browser as a plug-in, which adds the corresponding icon to the browser's menu bar. Alternatively, in another embodiment, as previously described, the retention module 114 may be a pre-installed stand-alone application, APP, applet, or the like, and the user may activate the retention module 114 by clicking the icon associated with it, for example on the desktop. In other embodiments, when information retention is required, the browser may prompt the user to install the plug-in or the application on the spot in order to activate the retention operation.
After the retention module 114 is activated, in step 312, the user may be asked to select a desired screenshot mode, from the multiple screenshot modes provided by the screenshot mode selection module in the retention module 114, to apply to the currently browsed web page that needs to be retained. The available screenshot modes may include, for example:
(1) visible page screenshot mode: screenshot is carried out based on the current visible Web page;
(2) full webpage screenshot mode: if the current webpage is large and can be viewed only by scrolling, the webpage can be automatically scrolled during screen capture so as to realize complete screen capture of the whole webpage;
(3) selective screenshot mode: a screenshot of the whole webpage is captured first and then cropped to the specific area marked out by the user to obtain the required screenshot; or a selection box is provided so that the user can select the content to be captured by dragging the selection box over the web page;
(4) combined screenshot mode: multiple screenshots are processed and then reorganized (for example, spliced) into a new picture.
The four screenshot modes listed above are just a few examples of the screenshot modes that the screenshot mode selection module can provide. In fact, other suitable screenshot modes can also be added to the screenshot mode selection module as needed, such as full screen screenshot, window screenshot, delayed screenshot, and so on, which are not illustrated herein.
Moreover, it should be appreciated that the screenshot modes provided by the screenshot mode selection module have in fact been widely adopted in many software applications, and thus the screenshot mode selection module may be constructed by calling existing screenshot functions. A detailed description thereof is therefore not necessary here either.
After the user selects one of the screenshot modes provided by the screenshot mode selection module, in step 314, a screenshot of the web page is taken in the selected mode, and a retained picture is generated.
For example, when the user selects the "visible page screenshot mode", the retention module 114 directly captures the currently visible part of the web page as the screenshot to be retained. If the user selects the "full webpage screenshot mode", the retention module 114 automatically scrolls the web page while capturing it until the complete web page has been captured as the screenshot to be retained. Both screenshot processes can be completed automatically by the system without manual operation by the user. When the user selects the "selective screenshot mode" or the "combined screenshot mode", the user is required to produce a suitable retained picture by, for example, manually moving the selection box or adjusting the positions of the multiple screenshots. These are all screenshot techniques often used by those skilled in the art and are not described in detail here.
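The scrolling behavior of the full webpage screenshot mode can be illustrated with a minimal sketch. It only plans where to scroll and where each captured tile would be placed in the spliced picture; the function name planScrollCaptures and its parameters are hypothetical and are not taken from the disclosure, and the actual capture and splicing calls are omitted.

```javascript
// Hypothetical helper: plan the scroll positions for a full-page capture.
// pageHeight and viewportHeight are in pixels; both names are assumptions.
function planScrollCaptures(pageHeight, viewportHeight) {
  if (pageHeight <= 0 || viewportHeight <= 0) {
    throw new RangeError("heights must be positive");
  }
  // The page cannot be scrolled further than this offset.
  const maxScroll = Math.max(0, pageHeight - viewportHeight);
  const captures = [];
  for (let y = 0; ; y += viewportHeight) {
    // The last tile is clamped so the page is never scrolled past its end.
    const scrollY = Math.min(y, maxScroll);
    // Rows already covered by the previous tile are skipped when splicing.
    const skipTop = y - scrollY;
    captures.push({ scrollY, skipTop, destY: y });
    if (y + viewportHeight >= pageHeight) break;
  }
  return captures;
}
```

Each entry tells a capture loop to scroll to scrollY, take a visible-page screenshot, discard the top skipTop rows, and place the remainder at row destY of the spliced picture.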
After the retained picture has been generated in the selected screenshot mode, the cross-domain access module of the retention module 114 may use cross-domain techniques to obtain the case task information that the user is currently working on directly from the task auditing system 130 across domains, without jumping to the audit task page that the task auditing system 130 provides in a domain different from that of the current web page involved in the money laundering activity.
In the conventional technology, after a retained picture is generated by screenshot from the website to be retained, the operator generally needs to leave the current website in the browser, jump to the dedicated auditing site provided by the task auditing system 130 and log in, load the corresponding audit task page to obtain the information related to the case task, fill in the upload form, and submit the corresponding screenshot to retain the information. This is because the web page to be retained and the auditing site are usually located in different domains and cannot directly communicate with each other across domains.
In particular, a common audit task page is shown in FIG. 4. In the present embodiment, a review task page provided by a conventional "enterprise review" website is described as an example. It should be understood that other audit task pages may actually have similar pages and functionality.
According to FIG. 4, a search function for enterprise information is provided on the left side of the audit task page. The enterprises found are displayed in the lower part according to the keywords that the user enters in the search bar above and the conditions selected in the filter bar. On the right side of the audit task page, an audit task upload form is provided. The form includes various items of information (such as case number, certificate type, and customer number; part of the information has been blurred for privacy reasons, which does not affect the understanding of the page's functions), an option box for the "attachment type" (such as identity information, enterprise information, or transaction information), and a dedicated interface for uploading attachments (such as screenshots). The information retention operation is completed only after the user manually fills in the items of the form, selects the type and location of the attachment (namely, the screenshot), and clicks the "submit" button to submit the retained information.
Therefore, in the existing case information retention mode, a user first jumps to the audit task page of the task auditing system 130, fills in the case-related information item by item, and finally retains the information (such as a screenshot) by submitting it as an attachment. When there are hundreds of pages from which pictures need to be retained, the browser must jump back to the audit task page of the task auditing system 130 after every screenshot, and the screenshot must then be uploaded manually as described above. As a result, the auditing of each case consumes a significant amount of manpower and resources.
In the solution of the present disclosure, the cross-domain access module of the retention module 114 provides a cross-domain technology, so that the related case task information and form information can be directly obtained and uploaded after automatic filling without jumping to an audit task page.
Before the cross-domain technique is described, the cause of the cross-domain restriction problem is first explained. Existing browsers and ordinary web pages generally use the XMLHttpRequest object to send data to or receive data from a server. Specifically, the XMLHttpRequest object for accessing server resources is a standard API of the W3C. It supports a plurality of text formats, such as XML and JSON, and can send requests over HTTP and HTTPS. For security reasons, the XMLHttpRequest object of a web page is usually restricted from accessing servers in other domains. However, although this cross-domain restriction improves the security of the system, it also seriously hinders data communication between systems in different domains, which is the origin of the cross-domain restriction problem.
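The condition under which a request counts as cross-domain can be made concrete with a short sketch. It compares the protocol, host name, and port of two addresses using the standard URL class; the function name sameOrigin and the example addresses are hypothetical.

```javascript
// Two URLs are in the same domain (origin) only if protocol, host name,
// and port all match; a difference in any one makes the request cross-domain.
function sameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  // URL normalizes default ports (80 for http, 443 for https) to "".
  return a.protocol === b.protocol &&
    a.hostname === b.hostname &&
    a.port === b.port;
}

// Hypothetical addresses: an evidence page and the audit task page.
const evidencePage = "https://pay.example/orders/1";
const auditPage = "https://audit.example/task/42";
console.log(sameOrigin(evidencePage, auditPage));
```

In this sketch the evidence page and the audit page differ in host name, which is exactly the situation in which an unextended browser refuses the request.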
The above analysis shows that the cross-domain restriction is implemented by a security policy in the browser that restricts cross-domain access. Therefore, if this security restriction of the browser is lifted, a web page in one domain can load data from, access, and upload to another domain. For this purpose, the cross-domain access module of the retention module 114 may extend the browser (Google Chrome is taken as the example in this embodiment) to implement the cross-domain technology. Specifically, by installing the retention module 114 as a plug-in, or by modifying the settings or code of the Chrome browser, the Chrome browser's security policy for cross-domain restriction can be turned off so that the browser is granted the right to access other domains (that is, the function of the Chrome browser is extended). The XMLHttpRequest object used by the extended Chrome browser can then access the server of any declared domain.
The following describes how to implement the Chrome browser extension with a specific example.
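When no other permissions have been obtained, the extension can use XMLHttpRequest to obtain resources from the domain in which the extension is installed. For example, assuming that an extension contains a JSON configuration file called config.json in a config_resources folder, the extension can retrieve the contents of the file as follows: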
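// Load a file packaged with the extension itself; no extra permission is needed.
var xhr = new XMLHttpRequest();
xhr.onreadystatechange = handleStateChange; // Implemented elsewhere.
// chrome.extension.getURL is the older name; newer Chrome versions provide chrome.runtime.getURL.
xhr.open("GET", chrome.extension.getURL('/config_resources/config.json'), true);
xhr.send();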
If an extension wants to access resources outside of its own domain, such as resources from http://www.google.com (assuming the extension itself is not from www.google.com), then a conventional browser will not allow such a request unless the extension obtains the corresponding cross-domain request permission.
In order to obtain the permission for cross-domain requests, a domain name or a match pattern can be added to the permissions section of the manifest file, so that the extension has the right to access domains other than the one to which it belongs. The specific operation is as follows.
A cross-domain permission can be a complete domain name, such as "http://www.google.com/", or a match pattern, such as "http://*.google.com/" or "http://*/". The match pattern "http://*/" indicates that HTTP requests may be initiated to all domains. Note that the match patterns here are similar to content script match patterns, except that any path information after the domain name is ignored. Note also that access rights are granted per access protocol (http, https, or another protocol name in the match pattern) and per domain name. For example, if an extension wishes to access one or more domains over both the https and http protocols, it must declare access permissions for both protocols separately, for example by listing both "http://www.google.com/" and "https://www.google.com/" in the permissions section.
(The above content on extensions is adapted from the 360Chrome developer documentation, with reference to the Chrome extension documentation.)
With these declarations, XMLHttpRequest is permitted to issue cross-domain requests. Of course, as described above, the cross-domain restriction problem may also be solved directly by installing corresponding plug-ins, scripts, applications, or APPs.
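As an illustration of the permission declarations discussed above, the following sketch shows a manifest fragment together with a simplified matcher. The fragment is written as a JavaScript object only for consistency with the other examples (in an extension it would be JSON in the manifest file, and newer manifest versions declare hosts under a separate host_permissions key). The matcher is an assumption written for this example and is not Chrome's actual implementation: it compares protocol and host only and ignores the path.

```javascript
// Manifest fragment shown as a JavaScript object for illustration only.
// Both protocols are declared separately for the same example host.
const manifestFragment = {
  permissions: ["http://www.google.com/", "https://www.google.com/"],
};

// Simplified matcher (hypothetical, not Chrome's implementation):
// supports an exact host, a "*." prefix, or "*" for all hosts.
function hostPermissionAllows(pattern, url) {
  const m = /^(https?):\/\/([^/]+)\//.exec(pattern);
  if (!m) return false;
  const scheme = m[1];
  const hostPattern = m[2];
  const target = new URL(url);
  // Access is granted per protocol, so the scheme must match exactly.
  if (target.protocol !== scheme + ":") return false;
  if (hostPattern === "*") return true;
  if (hostPattern.startsWith("*.")) {
    const base = hostPattern.slice(2);
    return target.hostname === base || target.hostname.endsWith("." + base);
  }
  return target.hostname === hostPattern;
}

// A request is allowed if any declared permission covers its address.
function isAllowed(permissions, url) {
  return permissions.some((p) => hostPermissionAllows(p, url));
}
```

With only the http entry declared, a request to the https address of the same host would be refused in this sketch, which mirrors the rule that each protocol needs its own declaration.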
In this way, the cross-domain access module of the retention module 114 can extend the Chrome browser in various forms, such as settings, code, or a plug-in, to lift the cross-domain restriction. Thus, when the process proceeds to step 316 after the retained picture has been generated by screenshot at step 314, the retention module 114 can use the extended Chrome browser to access the audit task page across domains and obtain the task information and form of the case currently being audited, for example by calling the task information interface 132 of the task auditing system 130 shown in FIG. 2 across domains, without jumping to the audit task page provided by the task auditing system 130 located in another domain. Meanwhile, in step 318, the retention module 114 may also automatically identify the type of information provided by the web page from which the screenshot was taken, based on information such as the address and/or content of that web page. For example, it may identify which type in the "attachment type" item shown in FIG. 4 (identity information, enterprise information, transaction information, etc.) the retained picture captured from the web page belongs to, so that the picture can be uploaded under the correct category.
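One simple way the type identification of step 318 could work is a rule table keyed on the page address and page content. The sketch below is only an assumption made for illustration: the host names, keywords, and the function name classifyAttachment are hypothetical, and the disclosure does not prescribe a particular identification method.

```javascript
// Hypothetical rule table: host names and keywords are placeholders.
const ATTACHMENT_RULES = [
  { type: "enterprise information", hosts: ["enterprise-lookup.example"], keywords: ["registered capital"] },
  { type: "transaction information", hosts: ["pay.example"], keywords: ["transfer", "withdrawal", "deposit"] },
  { type: "identity information", hosts: ["id-check.example"], keywords: ["certificate number"] },
];

// Returns the attachment type for a captured page, or null if no rule applies.
function classifyAttachment(pageUrl, pageText = "") {
  const host = new URL(pageUrl).hostname;
  const text = pageText.toLowerCase();
  // The page address is tried first, then keywords in the page content.
  const byHost = ATTACHMENT_RULES.find((r) => r.hosts.includes(host));
  if (byHost) return byHost.type;
  const byText = ATTACHMENT_RULES.find((r) => r.keywords.some((k) => text.includes(k)));
  return byText ? byText.type : null; // null: leave the choice to the user
}
```

Returning null when nothing matches keeps the manual selection of the attachment type available as a fallback.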
After the case task information is obtained and the category to which the captured web page belongs is identified, in step 320, the retention module 114 automatically fills in the upload form in the obtained audit task page, for example by automatically finding the matching information in the obtained case task information and the identified category and filling it into fields of the form such as case number, certificate type, and attachment type. The retention module 114 then uses cross-domain techniques to automatically submit the form to the task auditing system 130 together with the retained picture previously generated by the screenshot (e.g., as an attachment to the form), which may be accomplished, for example, by calling the task attachment upload interface 134 of the task auditing system 130 shown in FIG. 2 across domains. At this point, the fast retention mechanism is complete.
According to the above process, after the screenshot operation is completed on the web page to be retained in the current domain, the user does not need to jump to an audit task page in another domain; a series of cross-domain data operations, such as reading the task data in the other domain, filling in the form, and uploading the attachment, can be completed directly and automatically in the current domain. That is, steps 316, 318, and 320, which would otherwise have to be executed after jumping to an audit task page in another domain, can all be completed in the current domain along with the previous steps in the disclosed solution.
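The automatic form filling of step 320 can be sketched as a function that combines the obtained task information with the identified attachment type. The field names used here (caseNumber, certificateType, customerNumber, attachmentType, attachmentName) are hypothetical stand-ins for the form items shown in FIG. 4, and the actual upload call is left out.

```javascript
// Hypothetical sketch of step 320: build the upload form fields.
// All field names are assumptions, not taken from a real interface.
function buildUploadForm(taskInfo, attachmentType, screenshotName) {
  const required = ["caseNumber", "certificateType", "customerNumber"];
  const missing = required.filter((key) => !taskInfo[key]);
  if (missing.length > 0) {
    // Refuse to submit a partially filled form.
    throw new Error("task information is missing: " + missing.join(", "));
  }
  return {
    caseNumber: taskInfo.caseNumber,
    certificateType: taskInfo.certificateType,
    customerNumber: taskInfo.customerNumber,
    attachmentType: attachmentType,
    attachmentName: screenshotName,
  };
}
```

In a browser, the returned fields and the screenshot image could, for example, be appended to a FormData object and sent to the upload interface in a single request.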
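The way steps 314 to 320 chain together in the current domain can be summarized in a short sketch. Every dependency (capture, fetchTaskInfo, classify, fillForm, upload) is passed in as a parameter and is hypothetical; the sketch shows only the order of operations and does not assume any particular browser or server interface.

```javascript
// Hypothetical orchestration of one information retention operation.
// deps supplies the concrete functions, so no real API is assumed here.
async function retainOnce(page, deps) {
  const picture = await deps.capture(page, deps.mode);        // step 314: take the screenshot
  const taskInfo = await deps.fetchTaskInfo();                // step 316: cross-domain task information
  const attachmentType = deps.classify(page.url, page.text);  // step 318: identify the type
  const form = deps.fillForm(taskInfo, attachmentType);       // step 320: fill the form
  return deps.upload(form, picture);                          // step 320: cross-domain upload
}
```

Because the functions are injected, the same flow could be exercised with stubs during testing and with cross-domain requests in an extension.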
Thus, for each information retention operation in a case involving a large number of information retention operations, such as anti-money laundering case audits, the user need only open the web page to find the information and select a suitable screenshot mode, all the following steps being automated. Therefore, the quick retention mechanism can functionally realize multiple screenshot modes such as visible page screenshot, full webpage screenshot, selective screenshot, combined screenshot and the like according to actual business needs, achieves comprehensive coverage of screenshot requirements, supports real-time uploading of current screenshot retention information to a task auditing system under the condition that switching to a task auditing page under another domain name is not needed, and effectively improves retention efficiency.
In particular, compared with traditional retention techniques, the present disclosure supports multiple screenshot modes at the same time, automatically splices large web pages directly, and also supports mixed splicing across modes, covering a variety of business scenarios and making information retention convenient and fast. In addition, the fast retention mechanism of the present disclosure links the screenshot with the auditing system: when visiting a web page containing anomalous data, the operator only needs to activate the retention module and select the desired screenshot mode, and the system automatically completes the steps of taking the screenshot, obtaining the audit task information across domains, identifying the type, filling in the form, and uploading the retained picture, thereby realizing "one-key retention". Furthermore, because the user operates on the page being browsed on the client, any page that can be accessed can be retained, so the applicable range is particularly wide, and information on pages that prohibit crawler access can also be retained.
It should be understood that the above described modular structure and flow are merely exemplary illustrations of the fast retention mechanism of the present disclosure, and do not limit the scope of the present disclosure thereto. Those skilled in the art will appreciate that the described processes and environments may be modified as appropriate for the particular needs of their implementation.
For example, in one embodiment, browsers based on the Chrome kernel other than Google's Chrome browser may be used. Because they share the core code of Chrome, they can also implement cross-domain access, for example by lifting the browser's cross-domain restriction through an extension as illustrated. Moreover, even a browser that is not based on the Chrome kernel, such as the Microsoft IE browser, can implement cross-domain access by modifying the restriction policy for cross-domain access in its kernel program.
In another embodiment, in addition to modifying the cross-domain settings of the Chrome browser, there are other technical means of solving the cross-domain problem, such as JSONP, window.name combined with an iframe, an nginx proxy, and other cross-domain technologies that enable access to resources of other domains. Thus, the above scheme based on modifying the settings of the Chrome browser is only one specific example of a cross-domain scheme used in the present disclosure, and other suitable cross-domain means may also be used in the scheme of the present disclosure.
Also, although the retention module 114 is described in the above embodiments as a browser plug-in, it should be understood that the retention module may be implemented in various forms such as a script, a separate application, an APP, an applet, and the like, and is not limited to being a plug-in of the Chrome browser. The retention script, application, or APP integrates the screenshot mode selection functionality and cross-domain techniques described above, so that the information retention operation can be performed even if the user uses a browser that is not based on the Chrome kernel, or uses no browser at all. Such a solution has better versatility.
In other embodiments, the retention module may have more or fewer screenshot modes, and is not limited to the four described. For example, when a page has embedded in it special display windows for content such as video, Word documents, or PDF documents, a screenshot mode may be provided in which only the contents of these special windows are captured. As another example, when the top, bottom, or sides of a web page contain a large amount of advertising content, a center screenshot (i.e., "edge-cut") mode may be provided in which only the center portion of the web page is captured. Full screen screenshots, delayed screenshots, and the like may also be provided. Any of these screenshot modes can be offered as one of the screenshot modes of the present disclosure for the user to select, and they are not listed exhaustively here.
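The center ("edge-cut") mode mentioned above amounts to computing a crop rectangle inside the captured picture. A minimal sketch follows, assuming the margins are given as fractions of the picture size; the function name centerCropRect and the default margin of 10% per side are hypothetical choices made for the example.

```javascript
// Hypothetical helper for a center ("edge-cut") screenshot.
// margins are fractions of the picture size trimmed from each side.
function centerCropRect(width, height, margins = { top: 0.1, right: 0.1, bottom: 0.1, left: 0.1 }) {
  const x = Math.round(width * margins.left);
  const y = Math.round(height * margins.top);
  const w = width - x - Math.round(width * margins.right);
  const h = height - y - Math.round(height * margins.bottom);
  if (w <= 0 || h <= 0) {
    throw new RangeError("margins leave no content to capture");
  }
  // The rectangle can be passed to whatever cropping routine is in use.
  return { x: x, y: y, width: w, height: h };
}
```

The same rectangle computation could serve the selective screenshot mode if the margins were replaced by the bounds of the user's selection box.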
Additionally, in some embodiments, if the task information page is simple, e.g., it contains only one type of information, step 318 may be omitted, and the form filled in directly from the task information can be uploaded to the task auditing system 130 together with the retained picture.
Although the auditing of anti-money laundering cases is mainly described in the above embodiments and the accompanying drawings as the example of data anomaly auditing, it should be understood that, as described above, the fast retention scheme of the present disclosure can be applied to various other application scenarios that require screenshots of data anomalies to be collected from a large number of websites, such as anti-credit-swiping, anti-telecom-fraud, and the like.
The foregoing has described specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous. Moreover, those skilled in the relevant art will recognize that the embodiments can be practiced with various modifications in form and detail without departing from the spirit and scope of the present disclosure, as defined by the appended claims. Thus, the breadth and scope of the present disclosure should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.