GB2524491A - Multilingual system and corresponding method - Google Patents

Publication number
GB2524491A
Authority
GB
United Kingdom
Prior art keywords
response
language
http
request
content
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1405190.8A
Other versions
GB201405190D0 (en
Inventor
Richard Sheppard
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
INTERCEPTOR SOLUTIONS Ltd
Original Assignee
INTERCEPTOR SOLUTIONS Ltd
Application filed by INTERCEPTOR SOLUTIONS Ltd
Priority to GB1405190.8A
Publication of GB201405190D0
Publication of GB2524491A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G06F9/454 Multi-language systems; Localisation; Internationalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/972 Access to data in other repository systems, e.g. legacy data or dynamic Web page generation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer And Data Communications (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Translating web-based natural language content on the fly, or in real-time. If a natural language switch is required during a HTTP session, the HTTP response sent from a web server to a client web browser is modified by replacing at least one target item in the HTTP response with a replacement item. Target and replacement items are specified in a configuration file. The system operates in the request/response pipeline which is established between the client and web server. Request and Response objects are created as a result of a HTTP request being received from the client, and modification of the HTTP response is performed by writing the replacement item to a stream associated with the Response object. This enables content to be delivered from a web server to a client in a different natural language without affecting the underlying web application.

Description

Multilingual System and Corresponding Method
This invention relates generally to server-based software, and more particularly to multilingual web-based applications which are accessed and utilised by users with a variety of language requirements. The invention is particularly suited for use with applications which need to be compliant with differing user-specific requirements, such requirements arising as a result of, for example, language, environment or location-related constraints.
The present application is particularly suited for use with web-based applications, to provide a multilingual capability. In such situations, the user accesses a software application which is installed on a remote server. When a user uses a web browser (hereinafter 'the client') to access the application running on the server, the client submits an HTTP request message to the server. The server, which provides resources such as HTML files and other content such as images, or performs other functions on behalf of the client, returns a response message to the client. A sequence of request-response transactions is known as a HTTP session. The response contains completion status information about the request and may also contain requested resources in its message body.
A typical request message may consist of the following fields:
1. A request line, for example GET /images/picture.png HTTP/1.1, which requests a resource called /images/picture.png from the server.
2. Request headers, such as Accept-Language: en
3. An empty line.
4. An optional message body.
The Accept-Language header provides a list of natural languages which the client has indicated as acceptable for the response to be formed in.
A typical response message may consist of the following:
* A Status-Line (for example HTTP/1.1 200 OK, which indicates that the client's request succeeded)
* Response headers, such as Content-Type: text/html
* An empty line
* An optional message body containing the resource requested by the client.
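The request and response layouts listed above can be illustrated with a short Python sketch. It is only an illustration of the message structure, not part of the described system; the function name `parse_http_message` and the `Host` header value are assumptions introduced for the example.

```python
def parse_http_message(raw: str):
    """Split a raw HTTP/1.1 message into its start line, headers and body."""
    # The first empty line separates the headers from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]  # request line or Status-Line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Example request: request line, headers, empty line, no body.
request = (
    "GET /images/picture.png HTTP/1.1\r\n"
    "Host: www.example.com\r\n"  # hypothetical host, added for illustration
    "Accept-Language: en\r\n"
    "\r\n"
)

# Example response: Status-Line, headers, empty line, message body.
response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body>Hello</body></html>"
)

req_line, req_headers, req_body = parse_http_message(request)
status_line, resp_headers, resp_body = parse_http_message(response)
```

Header names are lower-cased here because HTTP header field names are case-insensitive.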
When the client sends the request, a connection is established between the client and server. In early versions of HTTP, this connection is closed after a single request/response pair. However, HTTP/1.1 allows for persistent connections to be maintained such that client-server connections can be pipelined. Using pipelining, multiple requests can be sent without waiting for a response from the server. The server returns the responses in the same order as the requests were received from the client. This enables greater efficiency in terms of time and required resources. Another feature introduced in HTTP/1.1 is chunked transfer encoding, wherein content transferred via persistent connections can be streamed rather than buffered.
In operation, a server uses a listening mechanism to monitor for HTTP requests on a certain port number (usually port 80). The server recognises when a client sends a HTTP request and creates two new objects in memory: one for the request and one for the response. The request object provides access to the information included in the HTTP request, such as the request headers and the request body. The response object enables the HTTP response to be constructed as required, e.g. by allowing headers to be set and the body content to be supplied. When the HTTP response has been completed and sent, the request and response objects are flushed from memory.
Turning to the issue of multilingualism, many web-based applications are accessed by end users having different natural language needs. Thus, multilingualism is a very common but costly issue for many organisations whose software users (e.g. customers, employees) speak different languages. When a software application is to be used around the world in a variety of countries, users want and need to interact with the software in their own language. Such language issues can arise even within the same country where portions of the population may be bilingual, e.g. Canada or Wales. Further still, an organisation may have offices in different countries, or may simply need to communicate with clients who speak a variety of languages.
There are two approaches to providing multilingual capabilities for software:
1. to design the multilingual capability into the software from the outset; or
2. to renovate/adapt an existing application through a localisation process.
Though it is more cost-effective to design for multilingualism from the outset, this approach is rarely taken as it can add significantly to development cost and time. By contrast, the localisation approach is to adapt an existing, mono-linguistic application to provide one or more additional versions for use with other languages. However, this requires considerable technical effort, and the necessary skills and resources may not be readily available when required. In addition, localisation leaves a costly legacy in the maintenance and administration of multiple versions of the software. Applications are 'stretched' to provide functionality which they were not originally designed to provide, sometimes giving the appearance of a hybrid system with deficiencies that are quickly apparent to end users.
Further still, both approaches are only viable options for the organisation developing the software. On the other hand, organisations using the software have only one option with respect to multilingualism: lobbying the software developer to provide the desired linguistic capability.
As a result of these difficulties, many software applications do not have effective multilingual capabilities. This problem is even more significant for enterprise applications and for minority languages. Addressing the issue requires the software vendor/developer to commit to prioritise the issue and allocate resources for addressing it, not to mention the considerable cost which must be funded in some manner.
Thus, it is desirable to provide a software solution which can provide web-based multilingual capabilities in a low cost, flexible and easily managed manner. Such an improved solution has now been devised.
Thus, in accordance with the present invention there is provided a system and corresponding method as defined in the appended claims.
In accordance with the invention there may be provided a web-based multilingual system comprising at least one software component arranged to: determine whether a natural language switch is required during a HTTP session; and if so: identify at least one target item in a HTTP response sent from a web server to a client during the HTTP session; and modify the HTTP response to replace the at least one target item with a replacement item; wherein the target and replacement items are specified in a configuration file.
The system may enable content to be delivered from a web server to a client (e.g. web browser) in a different (target) natural language other than the current or default language of the session. The target language may be chosen by a user or may be inferred. The system may enable the user of the client to specify which natural language the content is to be delivered in. Thus, the user may switch languages when accessing a web-based resource or application. Preferably, the system is configured to filter a response from the server to the client so that at least one pre-determined item of language-sensitive content can be detected and replaced by the invention before it reaches the client. Thus, the system may be configured to identify an item of language-sensitive content (target item) and replace it with an alternative (replacement) item which may be a translation of the target item. The system may comprise a storage structure, e.g. a database, for storing the replacement items.
Preferably, each target item is associated with a corresponding replacement item. The target items and their respective replacement items may comprise natural language content.
The target and/or replacement item may comprise text and/or image data.
The replacement of a target item may be performed by a replacement algorithm. The replacement algorithm may comprise rules which govern the replacement of the target item. A different replacement algorithm may be provided for each target item. Regular Expressions (RegEx) may be used by the algorithm to identify which portions of the response content are to be replaced. The RegEx may be constructed to not only look for the content to be replaced (target) but also preceding and subsequent content (context) and location of the content in order to ensure replacement accuracy.
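A context-sensitive replacement of the kind described above might look like the following Python sketch. The rule layout, the `apply_rule` helper and the English-to-Welsh example pair are assumptions made for illustration; the source does not specify the actual expressions used.

```python
import re

# Hypothetical rule: replace the English label "Save" with the Welsh "Cadw",
# but only where it is the text of a <button> element. The named groups
# capture the preceding and subsequent context so it can be preserved.
rule = {
    "pattern": re.compile(
        r"(?P<before><button[^>]*>\s*)Save(?P<after>\s*</button>)"
    ),
    "replacement": "Cadw",
}


def apply_rule(content: str, rule: dict) -> str:
    """Replace the target only where the surrounding context also matches."""
    def substitute(match: re.Match) -> str:
        # Re-emit the context unchanged around the replacement item.
        return match.group("before") + rule["replacement"] + match.group("after")

    return rule["pattern"].sub(substitute, content)


html = '<p>Save your work often.</p><button id="ok">Save</button>'
translated = apply_rule(html, rule)
```

Only the button label is changed; the word "Save" in the paragraph is left alone because its context does not match, which is the accuracy benefit the text attributes to matching on context.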
The system may be referred to as a language-switching system because it enables a user to change which natural language content is delivered in from a web application.
Preferably, the HTTP response is generated by an application executing on the web server.
Preferably, the content of the response is modified after it is generated by the application but before it is sent from the server to the client.
Preferably, the application is independent of the multilingual system. Thus, the application is preferably not part of the present invention per se, but the invention may interact with or use output generated by the application.
Preferably, the HTTP response corresponds to a HTTP request which is received by the server from a client, e.g. web browser. The HTTP request may comprise a request for a resource, such as a file.
One or more triggers may be used to determine whether a language switch is required. The one or more triggers may be pre-determined. A trigger may comprise explicit and/or contextual information. The trigger may comprise information which is obtained or generated at the start of the session, or during the session.
The trigger may be an indication that a user wishes to receive a response from the server which is in a natural language other than the default or current language used by the system. Thus, the language switch may be a user initiated event which provides an instruction to change the language in which content is delivered to the client. Additionally or alternatively, the language switch may be initiated as a result of information obtained from the session.
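One way to picture trigger evaluation is as an ordered list of checks against the session, where the first trigger that yields a language wins. The sketch below is illustrative only: the session keys, the trigger functions and their ordering are all assumptions, not details taken from the source.

```python
from typing import Callable, List, Optional

# Hypothetical session snapshot: a plain dict of whatever the system knows.
Session = dict


def from_explicit_choice(session: Session) -> Optional[str]:
    # User picked a language during the session (explicit information).
    return session.get("selected_language")


def from_profile(session: Session) -> Optional[str]:
    # Preferred language of an identified user (contextual information).
    return session.get("profile_language")


def from_cookie(session: Session) -> Optional[str]:
    # Language remembered from a previous session (contextual information).
    return session.get("cookie_language")


# Assumed precedence: earlier triggers override later ones.
TRIGGERS: List[Callable[[Session], Optional[str]]] = [
    from_explicit_choice,
    from_profile,
    from_cookie,
]


def required_language(session: Session, current: str) -> str:
    """Return the language indicated by the first trigger that fires."""
    for trigger in TRIGGERS:
        language = trigger(session)
        if language:
            return language
    return current  # no trigger fired, so keep the current language


def switch_needed(session: Session, current: str) -> bool:
    """A switch is required only if the required language differs."""
    return required_language(session, current) != current
```

For example, `switch_needed({"cookie_language": "cy"}, "en")` reports that a switch is required, while an empty session leaves the current language in place.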
Preferably, the system is arranged to:
o load one or more configuration files (into memory) to control system operations and provide replacement rules and content;
o respond to a set of events raised by the web server as a HTTP request and response are processed (referred to as the Request/Response pipeline).
Preferably, the system is arranged to search and/or examine specified file types for one or more pre-determined targets. Preferably, when a match is found it performs a predetermined replacement and passes the result back into the response pipeline. The file type may be, for example, a HTML file, although other file types may be used. For example, the replacement may be in an XML, CSS or Javascript file, or a Flash (SWF) file or other file format.
Preferably, the response is sent to the client over the Internet. Thus, the invention provides a mechanism whereby HTTP responses can be generated containing content in alternative natural languages. The language-modified responses can be created for an existing mono-linguistic web application running on a server. If the user (of the web browser) or other sources of information indicate that the user needs to interact with the application in a language other than the default language, a translated HTTP response may be returned by the system. Preferably, the translated HTTP response includes modified content provided in a natural language other than the current or default session language.
Preferably, the configuration file comprises at least one rule specifying how and/or when the HTTP response is to be modified. The configuration file may be an XML file.
Preferably, the system comprises a plurality of configuration files. Each configuration file in the plurality may relate to a different natural language.
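As a rough illustration of one configuration file per language, the Python sketch below loads target/replacement pairs from an XML document. The element and attribute names (`language`, `rule`, `page`, `target`, `replacement`) are invented for this example; the source states only that the configuration file may be XML and may contain rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical Welsh configuration; the schema is an assumption.
CONFIG_CY = """
<language code="cy">
  <rule page="/login">
    <target>Username</target>
    <replacement>Enw defnyddiwr</replacement>
  </rule>
  <rule page="/login">
    <target>Password</target>
    <replacement>Cyfrinair</replacement>
  </rule>
</language>
"""


def load_rules(xml_text: str):
    """Return (language code, [(page, target, replacement), ...])."""
    root = ET.fromstring(xml_text)
    rules = [
        (rule.get("page"), rule.findtext("target"), rule.findtext("replacement"))
        for rule in root.iter("rule")
    ]
    return root.get("code"), rules


# One entry per configuration file, keyed by language code.
configs = dict([load_rules(CONFIG_CY)])
```

Under this layout, supporting a further language would mean loading one more file into `configs`, leaving the underlying application untouched.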
Modification of the HTTP response may be performed using a filter mechanism. The filter mechanism may be a HTTP module. The filter may intercept the HTTP response and/or request in the pipeline. The Response object, created by software on the server during a HTTP session, may comprise a Filter property that can be used to attach customised streams.
A Stream object may be created and the Filter property of the Response may be modified so that it refers to that Stream object. The Stream may, therefore, be associated with the Response. Modified (i.e. replacement) data may be written to the stream by code provided by the system of the invention. Thus, the HTTP output may pass through the filter before it is transmitted to the client.
The filter mechanism may be arranged and configured to detect the one or more predetermined target items in the output and replace each with their respective replacement items. Replacement data may be written into a buffer in chunks, and then flushed into a Stream associated with the Response object. Thus, the response filter enables outgoing content to be inspected and modified, the underlying application being unaware and unaffected by the post-processing activities of the response filter.
Preferably, at least one software component of the system is arranged such that it operates in the HTTP request/response pipeline which is established between the client and web server. Preferably, the system is arranged for execution on a server e.g. web server.
Thus, the invention serves to intercept HTTP responses and/or requests so that the information provided in the request can be accessed and analysed and the content of the HTTP response modified accordingly before being sent to the client.
Preferably, Request and Response objects are created by the server as a result of a HTTP request being received from the client, and modification of the HTTP response is performed by writing the replacement item to a stream associated with the Response object.
The HTTP response may be modified to include a language selection in accordance with one or more rules contained in the configuration file.
The system may be further arranged to operate in reverse proxy mode wherein a local URL entered at the client is mapped by the server to a remote site, such that the system serves as client to that site.
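The URL mapping involved in reverse proxy mode can be sketched as follows. This is a simplified illustration: the mapping table, the example URLs and both helper names are assumptions, and the actual fetching of the remote content is omitted.

```python
# Hypothetical mapping from a local path prefix to a remote site.
PROXY_MAP = {"/crm/": "https://remote.example.com/app/"}


def to_remote(local_path: str):
    """Map a local URL path to the remote URL the system should request."""
    for prefix, remote in PROXY_MAP.items():
        if local_path.startswith(prefix):
            return remote + local_path[len(prefix):]
    return None  # not a proxied path


def rewrite_links(content: str) -> str:
    """Convert remote references in fetched content back to local ones."""
    for prefix, remote in PROXY_MAP.items():
        content = content.replace(remote, prefix)
    return content
```

After fetching the remote page as a client, the rewritten content would be passed on for the usual replacement processing before being returned to the browser.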
The system may be further arranged to generate or update a data set relating to user activity obtained from a HTTP request corresponding to the response. Thus, the system may be used to log user-based activity. Information may be gathered from the header of a HTTP request. The logged information may relate to the user's language switches or preferences, and/or it may relate to the content of the request or response.
The system may comprise a software component arranged to parse the content of a HTTP request and/or HTTP response. The software may provide a set of nodes. It may process each type of node to extract any language-sensitive content. It may export said language-sensitive content to an XML file. The skilled person will understand that different methods may be employed for embodiments wherein the content files are not HTML-related, such as CSS, txt content and so on. Other embodiments of the invention may include other types of parsing modules for, for example, PDF files or Word documents.
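The parse, extract and export steps described above could be sketched in Python as below. The choice of which nodes count as language-sensitive (visible text plus `alt` and `title` attributes) and the `matches`/`match` element names are assumptions made for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextScraper(HTMLParser):
    """Collects candidate language-sensitive strings from HTML content."""

    SKIP = {"script", "style"}  # content of these elements is not natural language

    def __init__(self):
        super().__init__()
        self.items = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        for name, value in attrs:
            # Attribute text that is shown to the user is also a candidate.
            if name in ("alt", "title") and value:
                self._add(value)

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._add(data)

    def _add(self, text):
        text = text.strip()
        if text and text not in self.items:  # skip blanks and duplicates
            self.items.append(text)


def scrape_to_xml(html: str) -> str:
    """Parse HTML, extract candidate strings and export them as XML."""
    scraper = TextScraper()
    scraper.feed(html)
    scraper.close()
    root = ET.Element("matches")
    for text in scraper.items:
        ET.SubElement(root, "match").text = text
    return ET.tostring(root, encoding="unicode")


page = (
    "<html><head><style>p{color:red}</style></head><body>"
    '<h1>Welcome</h1><img src="a.png" alt="Logo"><p>Welcome</p>'
    "</body></html>"
)
xml_out = scrape_to_xml(page)
```

Each exported item could then be given a translation and used as a target/replacement pair.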
This feature may be referred to as 'scraping' mode. The system may thus be operable in 'replacement mode' as described above and/or in 'scraping mode'. The term 'web scraping' is a term known in the art to mean an automated process wherein data is gathered from the World Wide Web by a program. There are various known techniques which are used for scraping. The skilled person would understand that any suitable scraping approach or implementation may be used in conjunction with the invention.
Preferably, the system further comprises a user interface component arranged to enable one or more authorised users to specify or select the one or more target items and corresponding replacements for a given language. The one or more target items and corresponding replacements may be selected from a list or set of items contained within a web page.
The user interface component may be referred to as a 'management component'. It may enable an authorised user, such as an administrator-level user, to control, edit or specify how and when the system operates. It may enable the authorised user to update the one or more configuration files.
Also according to the invention there is provided a method corresponding to the system described above. The method may include the step of providing a system as described above.
A method according to the invention may be a method of modifying the natural language content of a HTTP response delivered from a web server to a client, the method comprising the steps: specifying at least one target and replacement item; determining whether a natural language switch is required during a HTTP session; and if so: identifying at least one target item in the HTTP response sent from the web server to the client during the HTTP session; and modifying the HTTP response to replace the at least one target item with a replacement item.
The at least one target and replacement item may be stored in non-volatile memory. They may be stored in a configuration file.
The target item and corresponding replacement item may comprise natural language content. The target item may comprise text and/or an image, for example.
The HTTP response may be generated by software executing on the web server. The software may be an application.
The method may further comprise the steps of: creating Request and Response objects as a result of a HTTP request being received from the client; and modifying the HTTP response by writing the replacement item to a stream associated with the Response object.
The HTTP response may be modified to include a language selection in accordance with the configuration file.
The method may comprise the step of: parsing the content of a HTTP response to provide a set of nodes; processing each type of node to extract any language-sensitive content; and exporting said language-sensitive content to an XML file. The method may comprise the step of: providing a user interface component arranged to enable one or more authorised users to specify or select the one or more target items and corresponding replacements for a given language.
A trigger may be used to determine whether a language switch is required, preferably wherein the trigger comprises information which is obtained or generated at the start of the session, or during the session.
Thus, in one sense, the invention may be viewed as a means for controlling and/or modifying the content of a HTTP response. Additionally or alternatively, it may be viewed as a means for controlling the content which is received and displayed by a web browser, and/or influencing the manner in which the user interacts with a web browser.
Additionally or alternatively, it may be viewed as a multilingual software tool which sits in the HTTP request/response pipeline. It intercepts messages communicated between the user's machine and the server, and then detects and replaces selected portions or resources in accordance with a set of pre-determined rules. Thus, the response provided to the user contains content which has been modified. The modification may be (natural) language-related. This translation process is seamless and invisible to the user and also the application. The invention may also be viewed as a natural language converter which enables customised content to be delivered to the client, tailored according to their natural language requirements or preferences.
The invention also provides an extensible solution in that additional natural language capabilities can be added to the system by including further configuration files. This can be achieved without affecting or impacting the operation of an underlying server application.
These and other aspects of the present invention will be apparent from, and elucidated with reference to, the embodiment described herein.
An embodiment of the present invention will now be described, by way of example only, and with reference to the accompanying drawings, in which:
Figure 1 illustrates an overview of the main components and events relating to the IE of an embodiment of the invention.
Figure 2 shows an overview of the main components of the system architecture in accordance with an embodiment of the invention.
Figures 3a and 3b show known techniques for delivering data from a web server via the use of Streams.
The following embodiment is described in relation to an illustrative implementation of the present invention. In this example, the invention is described as being implemented using ASP.net on an IIS platform. However, the skilled person will appreciate that the invention may be implemented in a variety of ways and using a variety of deployment technologies and platforms. The invention is not limited in respect of the hardware and/or software technologies with which, or upon which, it is implemented.
A core aspect of the present invention is that it has the ability to modify content returned from a web server to a client during a HTTP session. This may be implemented using a response filter, i.e. a customised Stream. To create the response filter, a class is used which extends a Stream class (e.g. the MemoryStream class) and so inherits the base functionality defined for all instances of the superclass.
A Stream is an object which functions as a buffer (temporary storage area). Data can be inserted into a Stream via a call to its write method and accessed via its read method.
Figures 3a and 3b show known techniques for using Streams to produce output during a HTTP session. In figure 3a, a simple content delivery is shown wherein the page being rendered 15 sends its data to a writer object (HttpWriter) 16. The writer object 16 buffers the data into chunks and then writes it into a stream 17. The data is then read from the stream 17 and sent to the client 2.
Multiple Streams can be attached to each other in a chain formation such that the output of one Stream feeds into another. A Response filter provides a mechanism for chaining a custom stream with the stream used by the writer object 16 as shown in Figure 3b. The write() methods of the Response object (HttpResponse) make calls to a writer (HttpWriter) object 16. The writer object 16 buffers the data into chunks, which are written to the custom Stream 18. The custom stream 18 can inspect and modify the output before it is written to the stream 17 that returns the content to the requesting client.
The Response object, created by the server during a HTTP session, has a Filter property (i.e. variable) that can be used to chain customised streams together so that output can be tailored according to requirements. To do this, a Stream object is created and the Filter property of the Response is modified so that it refers to that Stream object. Thus, all HTTP output sent by the Write method passes through the filter. According to the present invention, the filter mechanism is arranged and configured to detect predetermined target items in the output and replace them with replacement items.
The data is written into a buffer in chunks, and then flushed into the stream for subsequent transmission to the client. Thus, by using a filter to inspect and modify outgoing content the underlying application is unaware and unaffected by the post-processing activities of the invention. Substitution of the target items with the replacement items provides the language switching functionality of the invention.
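The chained-stream idea can be sketched in Python as follows. This is an analogue of the mechanism, not the ASP.net code: the class name `ReplaceFilter` and the English-to-Welsh pair are assumptions, and for simplicity the filter holds the whole body until `flush()` so that a target split across two chunks is still found.

```python
import io


class ReplaceFilter:
    """A custom stream that sits in front of the stream feeding the client."""

    def __init__(self, downstream, pairs):
        self.downstream = downstream  # the stream that returns content to the client
        self.pairs = pairs            # (target, replacement) byte strings
        self._buffer = bytearray()

    def write(self, chunk: bytes) -> int:
        # Chunks from the writer are accumulated rather than passed straight on.
        self._buffer.extend(chunk)
        return len(chunk)

    def flush(self) -> None:
        # Apply every replacement, then hand the result to the next stream.
        data = bytes(self._buffer)
        for target, replacement in self.pairs:
            data = data.replace(target, replacement)
        self.downstream.write(data)
        self._buffer.clear()


# The application writes its output as usual, unaware of the filter.
client_stream = io.BytesIO()
response_filter = ReplaceFilter(client_stream, [(b"Welcome", b"Croeso")])
for chunk in (b"<h1>Wel", b"come</h1>"):
    response_filter.write(chunk)
response_filter.flush()
sent = client_stream.getvalue()
```

Buffering the full body trades memory and latency for correctness at chunk boundaries; a streaming variant would need to retain a tail of each chunk at least as long as the longest target.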
In an exemplary embodiment, the present invention comprises two scparate software applications which interface in an offline manner via XML configuration files: 1. The intercept Engine (IE) is the software component which provides the core frmnctionality of the language conversion system; 2. The Management Component (MC) provides a management tool for controlling, speci'ing and configuring the behaviour of one or more IE components and enabling adminisiration functionality. The MC provides a cenfralised resource (application or app') for managing multiple lBs across a number of applications 3' The Intercept Engine (IE) can be implemented as a lightweight, optimised (performance and memory) HTTP module that integrates with a web server 1 such as Microsoft IlS. In such an embodiment, it is installed for execution on the web server (or potentially as a Reverse Proxy on a remote server) and sits in the real-time request/response pipeline between the browser 2 and server I The IF. perfon'ns the following flmctions: o Loads the configuration files to control system operations and provide replacement rules and content; o Responds to a set of events raised by the web server 1 as a HTTP request 4 and response 5 are processed (referred to as the requestiresponse Pipelines); o When in replacement' mode (the typical operation mode), it searches the specified file types (e.g. html) for matches, When a match is found, it performs the defined replacement and passes the result back into response pipeline; o When in scraping' mode, the IF. parses the content into a set of node types (typically html nodes, but other content types may be used). 
It then processes each type of node to extract any language-sensitive content and exports this information to a matches' XML tile; o Inserts a language selector into the outgoing html (subject to configuration ruics); o Manages the session state in order to track the current user language, and any language switches; o Can operate in a reverse proxy' mode where a local URL (entered at the browser) is mapped internally to an external site; in such cases, the IE acts as a client to that site, receives the HTTP stream and performs the standard replacement or scraping thnctions, o Logs a configurable set of user activity. This may include information gathered from the HTTP header and language switching/use, but can also he enhanced to extract and store information from message content.
A key aspect of the IE is its ability to detect when a language switch is required. There are various potential triggers which can be detected by the system, resulting in a language switch being applied to the outgoing HTTP response 5. Explicit and contextual information can be used to identify the required language and trigger the switch (if the required language is different from the current language). Such information can be generated at various times during the client-server interaction process, and may include:
* At the start of a session:
  o the language last used by the user in a previous session (i.e. stored in a cookie)
  o the preferred language stored in a user profile record if the user can be identified; this can happen at any time during a session if they log in or otherwise identify themselves
  o the URL (or shortcut) used to access the application, if that denotes a preferred language; this can be any part of the URL, including the querystring
  o the referring URL
* During a session:
  o an explicit request from the user, i.e. selecting a language from a language selector/choice in the user interface
  o content sensitivity: a specific part of the content can trigger a language switch (e.g. the preferred language of a contact record in CRM)
For each of these the behaviour is configurable:
* Whether the method is active
* The rules that apply to each method
* The precedence/order for the methods
The functionality of the IE can be viewed in terms of Pre & Post Processing activities, as follows:
Pre (Request pipeline) processing:
- Load config files (if not already loaded)
- Respond to request pipeline events
- Determine the current language for the session and whether a language switch has occurred (see above); update session settings accordingly
- Log any user activity (as configured)
- Depending on whether the IE is configured in replace or scrape mode, create the relevant Stream object to be used in the response pipeline
- Identify whether the IE is configured for reverse proxy mode; if so, terminate further request processing and transfer control directly to the response pipeline
Post (Response pipeline) processing:
- Check that the IE is enabled (config option); if not, no further processing is required
- If running as a Reverse Proxy:
  o The IE will act as a client to the target server to obtain the remote URL (content) corresponding to the local request (handling any redirects)
  o Apply a set of configured rewrite rules to the target content to convert all links and references to local references
  o Insert the modified target content into the local response pipeline so that further processing can continue as normal
- If the Replace Stream object has been created:
  o If the current page is configured for it, and the content type is HTML, insert a language selector (as per config) into the HTML at the designated (as per config) position
  o If the current language and page are set (by config) for replacements, ensure that the latest set of match/replacement pairs for the language and page are loaded; then perform the replacements in the outgoing (response) stream
- If the Scrape Stream object has been created:
  o Load any existing match configuration for the current page
  o Parse the content stream to identify content that is 'language sensitive'. The parsing method and algorithms for identifying content depend on the type of the content stream (html, XML, CSS, text, etc.)
  o For each item of content extracted, check whether it has previously been scraped (i.e. is in the match configuration set); if not, add it
  o When the content stream has been processed, write the match data set back out to the config file
In an illustrative ASP.net implementation, the IE executes on the web server 1 and runs within the server process. A customised response.write method is used to write the desired output to the Stream object referenced by the Response Filter property.
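By way of illustration only, the configurable language-switch triggers described earlier in this section might be evaluated as in the following Python sketch. The `RequestInfo` fields, the trigger names and the `lang` cookie/querystring key are assumptions made for the example; the specification does not define them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class RequestInfo:
    """Hypothetical snapshot of the request data that the triggers inspect."""
    query: Dict[str, str] = field(default_factory=dict)
    cookies: Dict[str, str] = field(default_factory=dict)
    profile_language: Optional[str] = None


# Each trigger returns a language code, or None if it does not apply.
TRIGGERS: Dict[str, Callable[[RequestInfo], Optional[str]]] = {
    "selector": lambda req: req.query.get("lang"),    # explicit user choice
    "profile": lambda req: req.profile_language,      # stored user preference
    "cookie": lambda req: req.cookies.get("lang"),    # language last used
}


def resolve_language(req: RequestInfo, current: str, order: List[str],
                     active: Set[str], supported: Set[str]) -> Tuple[str, bool]:
    """Return (language, switched) using the configured precedence order."""
    for name in order:
        if name not in active:
            continue  # this method is disabled by configuration
        candidate = TRIGGERS[name](req)
        if candidate in supported:
            return candidate, candidate != current
    return current, False  # no trigger fired: keep the current language
```

In this sketch the first active trigger that yields a supported language wins, which is one simple way of honouring a configured precedence; other rule schemes are equally possible.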
Various events can be raised when a HTTP request 4 is processed. Some events are raised before the request is processed:
* BeginRequest 6. This event signals a new request; it is guaranteed to be raised on each request.
* AuthenticateRequest 7. This event signals that the configured authentication mechanism has authenticated the request.
* AcquireRequestState. This event signals that individual request state should be obtained.
Other events are raised after the request is processed:
* PostRequestHandlerExecute. This event signals that the HTTP handler has finished processing the request.
* EndRequest. This event signals that all processing has finished for the request. This is the last event called when the application ends.
In an illustrative implementation of the present invention, the IE consists of four separate sub-modules that are each triggered for execution by an event raised by the web server as it processes a request/response pair.
In addition, the IE handles the modification of content which is to be provided as the response to the client, as described above.
Though each IE sub-module runs independently from the others, they form part of a single web server thread whilst processing a request/response through to completion. Also, a thread remains active and processes subsequent requests.
Therefore, memory context is retained for the duration of a single request, enabling the sub-modules to pass information via shared variables. The Request, Response and Context objects of the web server can be used for the same purpose, carrying relevant data and information that is shared with, or influences, the server.
As the lifetime of the host thread for the IE extends beyond a single request, the IE can also carry its configuration data from one request to another, hence improving performance through reducing loads from disk.
The server events that the IE responds to are:
- BeginRequest
- AuthenticateRequest
- PostAcquireRequestState 8
- EndRequest
Between PostAcquireRequestState 8 and EndRequest, the server's Response.Write 9 method is called which invokes IE activity (not dissimilar to an event handler).
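The event-driven arrangement above can be sketched, purely for illustration, as a small dispatcher in Python. The `Pipeline` class, the shared `context` dictionary and the three example sub-modules are hypothetical stand-ins for the web server pipeline and the IE sub-modules; they are not the ASP.net API.

```python
from typing import Callable, Dict, List

EVENTS = ["BeginRequest", "AuthenticateRequest",
          "PostAcquireRequestState", "EndRequest"]
Handler = Callable[[dict], None]


class Pipeline:
    """Toy stand-in for the server pipeline: raises events in a fixed order."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {name: [] for name in EVENTS}

    def on(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def _raise(self, event: str, context: dict) -> None:
        for handler in self._handlers[event]:
            handler(context)

    def process(self, context: dict, render: Callable[[], str]) -> str:
        for event in EVENTS[:-1]:
            self._raise(event, context)
        # Stand-in for Response.Write: output passes through any filter
        # that a sub-module attached during PostAcquireRequestState.
        body = render()
        response_filter = context.get("response_filter")
        context["body"] = response_filter(body) if response_filter else body
        self._raise("EndRequest", context)
        return context["body"]


# Hypothetical sub-modules that share state through the context dictionary.
def load_config(ctx: dict) -> None:
    ctx["replacements"] = {"Hello": "Bonjour"}


def attach_filter(ctx: dict) -> None:
    pairs = ctx["replacements"]

    def replace(body: str) -> str:
        for old, new in pairs.items():
            body = body.replace(old, new)
        return body

    ctx["response_filter"] = replace


def finalise(ctx: dict) -> None:
    # Content length is recomputed because the body may have changed.
    ctx["content_length"] = len(ctx["body"].encode("utf-8"))
```

A caller would register `load_config` on BeginRequest, `attach_filter` on PostAcquireRequestState and `finalise` on EndRequest, then call `process` with a function that renders the page.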
Referring to figure 1, the meaning of each event/method and the IE operation for each of these is:
1.1 BeginRequest 6
This occurs as the first event in the HTTP pipeline, i.e. when the web server 1 first receives a request from a client (browser) 2.
The IE functionality at this stage is as follows:
* Check to see if this is an ASP form submission (i.e. POST); if it is, the server platform will not be able to handle this incoming request directly. The solution is to package the data in the request into a shared object and re-present the request to the web server in a more acceptable format.
* Check to see if each of the config files has previously been loaded. If not, or if the timestamp of any of the config files differs from when they were last loaded, load the file content into the thread memory.
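The timestamp check in the second step might look like the following Python sketch, offered as an assumption-laden illustration rather than the actual implementation. The `ConfigCache` class and its `loads` counter are hypothetical names introduced for the example.

```python
import os
from typing import Dict, Tuple


class ConfigCache:
    """Keeps file content in memory and reloads only when the timestamp changes."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.loads = 0  # number of reads from disk, kept for illustration

    def get(self, path: str) -> str:
        mtime = os.path.getmtime(path)
        cached = self._entries.get(path)
        # Reload if never loaded, or if the timestamp differs from the last load.
        if cached is None or cached[0] != mtime:
            with open(path, encoding="utf-8") as handle:
                content = handle.read()
            self._entries[path] = (mtime, content)
            self.loads += 1
        return self._entries[path][1]
```

Because the cache object outlives a single request, repeated requests read the file from disk only when it has actually been modified.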
1.2 AuthenticateRequest 7
This event occurs when the web server 1 has established the identity of the user, i.e. when the server 1 is satisfied that this is a valid request which will be processed further.
The IE functionality at this stage is:
* Check to see if the IE is configured to run as a Reverse Proxy. If so:
  o Obtain the local path used and store it in a shared variable (used later)
  o Rewrite the requested page to a test page in the root of the site, i.e. allow the server to work as normal, just divert it to a null/inert page (see PostAcquireRequestState; this page will never be retrieved by the server).
The main purpose of this is for handling the Reverse Proxy config where the request is 'stubbed' so that the web server is fooled into a fairly passive action until the response is manipulated later.
1.3 PostAcquireRequestState 8
This event occurs when the request state (e.g. session state) that is associated with the current request has been obtained. Thus, the request 4 has been fully formed within the server, which is ready to retrieve the content from the underlying site/application. It is also where any 'filters' that are to handle the response from the site are to be defined.
The IE functionality at this stage is:
* Check that the IE is enabled (config option); if not, bypass all further operations below and return control to the server
* If configured to run in Reverse Proxy mode:
  o Translate the local URL into the target URL;
  o Send a http request 11 to the target site 10, i.e. the IE acts as a client/browser to get the normal http response. This includes the headers, cookies and other content from the local request (this needs to be maintained seamlessly through the normal and proxy round trips);
  o Get the response from the target site 10
  o Copy the content of the target response to the local response object
  o If the response is a redirect, transform the remote URL in the redirect into a local URL. This is necessary so that the local browser is involved in the redirect request and therefore updates/maintains the correct state;
  o Otherwise, continue
* Check to see if language tracking has been set for the user/browser session; if so get it, if not set it to a default.
* Check to see if a language switch occurred; if so update the language setting for this request and also in the user/browser session state.
* Check that the page is not on the excluded pages list; if not:
  o If this is an asp.net form (see handling of this in BeginRequest), then retrieve the form data package
  o If the IE is configured for Replacement mode, set up a Response Filter to perform a replacement (i.e. attach a replacement Stream by creating a stream and binding the Response Filter property to that stream)
  o If the IE is configured for Scraping mode, set up a Response Filter to perform a scrape (i.e. create and attach a stream to the Response Filter property)
* If configured to run in Reverse Proxy mode (i.e. there is little else for the local web server to do):
  o Check to see if the file format/structure of the http response from the target is one which the IE is set up to parse/process (e.g. a text file containing html, etc.); if it is, perform a set of replacements (controlled by config parameters) to transform target URLs into local URLs.
  o Write the http response content from the target to the local response, i.e. do a response.write which triggers the overridden method (as described below)
  o Direct the server to complete the request (i.e. trigger the EndRequest event), as any further processing is redundant and a resource waste.
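The URL translation steps of Reverse Proxy mode can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `LOCAL_BASE` and `TARGET_BASE` are hypothetical configuration values, and a real deployment would drive the rewriting from the configured rules rather than a single base-URL swap.

```python
import re
from typing import Dict, Optional

# Hypothetical configuration: the local URL prefix and the remote site it maps to.
LOCAL_BASE = "http://intranet.example/app"
TARGET_BASE = "https://remote.example"


def _swap_base(url: str, old: str, new: str) -> Optional[str]:
    """Swap a base URL, but only when the match ends on a URL boundary."""
    if not url.startswith(old):
        return None
    rest = url[len(old):]
    if rest == "" or rest[0] in "/?#":
        return new + rest
    return None


def to_target(local_url: str) -> str:
    """Translate the local URL into the target URL before acting as a client."""
    mapped = _swap_base(local_url, LOCAL_BASE, TARGET_BASE)
    if mapped is None:
        raise ValueError("URL is not served by the reverse proxy: " + local_url)
    return mapped


def to_local(target_url: str) -> str:
    """Translate a target URL into a local one; other URLs pass through."""
    return _swap_base(target_url, TARGET_BASE, LOCAL_BASE) or target_url


def localise_redirect(status: int, headers: Dict[str, str]) -> Dict[str, str]:
    """Rewrite a redirect so that the browser follows it via the local site."""
    result = dict(headers)
    if status in (301, 302, 303, 307, 308) and "Location" in result:
        result["Location"] = to_local(result["Location"])
    return result


def rewrite_body(html_text: str) -> str:
    """Convert absolute links to the target site into local links."""
    pattern = re.escape(TARGET_BASE) + r'(?=[/?#"\'\s]|$)'
    return re.sub(pattern, lambda match: LOCAL_BASE, html_text)
```

The boundary check prevents a base such as `https://remote.example` from also matching an unrelated host that merely starts with the same characters.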
1.4 Response.Write 9
A series of Write operations are performed on the stream, followed by a Flush. The write operations are performed by methods of the present invention, which override the write methods which would otherwise be invoked by Response.Write.
In all embodiments of the invention, the resulting functionality is that a portion of customised code is invoked. This customised code writes the desired changes (language replacements) to the Stream object that has been associated with the response so that when the response is sent back to the client 2 the content has been modified.
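A filter of this kind, which receives a series of writes followed by a flush, might be sketched in Python as below. The `BufferingFilter` class, its `sink` and `transform` parameters and the content-type test are assumptions for the purpose of illustration; they are not the ASP.net Response Filter API.

```python
import io
from typing import Callable, List


class BufferingFilter:
    """Collects written chunks and applies a text transform when flushed."""

    def __init__(self, sink: io.BufferedIOBase,
                 transform: Callable[[str], str], content_type: str) -> None:
        self._sink = sink
        self._transform = transform
        # Assumption: only text content types are treated as parseable.
        self._parseable = content_type.startswith("text/")
        self._chunks: List[bytes] = []

    def write(self, chunk: bytes) -> None:
        if self._parseable:
            self._chunks.append(chunk)  # buffer until every chunk has arrived
        else:
            self._sink.write(chunk)     # binary content is passed through unaltered

    def flush(self) -> None:
        if self._chunks:
            # Join before decoding: a multi-byte character may span two chunks.
            text = b"".join(self._chunks).decode("utf-8")
            self._sink.write(self._transform(text).encode("utf-8"))
            self._chunks.clear()
        self._sink.flush()
```

Buffering the whole body before transforming it trades memory for correctness, since a match string could otherwise be split across two writes.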
Depending on whether the IE is running in Reverse Proxy mode or not, the source of the http content will vary:
- If in Reverse Proxy mode, it will contain the content written to the response stream at the end of the PostAcquireRequestState and will have been invoked by the Response.Write there.
- Otherwise, it will have been called from the web server 1 in the normal processing of retrieving a page/app output as part of a web request.
In both cases, operation is the same as the Response object 5 will contain the http content/stream to be sent back to the browser 2, and providing the page is not on the excluded list, the response filters will have been set up for replacement or scraping.
The IE functionality at this stage is:
* Check if the content is parseable (i.e. text format); if not, it is binary and cannot be parsed; the IE simply leaves the response content as it is, without alteration;
* De-chunk the content into a single data structure: the web server will typically write large content in multiple chunks, as mentioned above. For the IE this simply means appending each chunk to an internal buffer until all chunks have been written.
* Check to see if the language is the default/native language for the site; if so, and if not configured to process native pages, do nothing; otherwise either do replacements or scraping:
* Replacements:
  o Check to see if the replacement config has been loaded already. If it has not, or if the file timestamp is newer than the previous load, load the file into memory;
  o Using the replacements defined for the current page:
    * If the replacement config contains an optimised RegEx expression, use that; otherwise construct the default RegEx around the match string;
    * Find all strings in the target content that match the RegEx. For each, replace the matched content with the replacement string and ensure html and character encodings are correct(ed)
* Scraping:
  o Check to see if scraping output files (matches and pages) already exist; if so, load them into the matches and pages data structures
  o If the content to be scraped is html, process it using the html parser to produce an internal 'html document' (the html broken down into a data structure with nodes for each tag in the html);
  o Walk through the html document structure to visit each node:
    * Depending on the node type, extract the attributes that are known to contain language related content
    * Some node types will require recursion.
    * Where relevant content is identified for the node, check if it is in the matches data structure.
    * If it is there, but the timestamp exceeds the configured scrape refresh duration, remove it;
    * If it isn't there (or has been removed due to refresh expiry), write the page, node tag, attribute tag, content and scrape timestamp into the matches data structure;
  o Check that the page information is in the pages data structure; if not, or if expired, (re)write the page information;
  o Once all processing has been completed, write the memory structures for the matches and pages to the output xml files
* If the content is html, check that the config options are not set to prevent language selector insertion and then:
  o Use the language selector html template from the config file
  o Use the config rules for the current language (i.e. hide, highlight, etc.)
  o Insert the URLs constructed to reload the current page with the selected language in the querystring
  o Inject the selector html into the html string using the configuration options (based on a target html string start and before/after/replace instructions)
* If a html file, make sure the encoding is UTF8 (always good practice, and ensures any non-US-ASCII characters are preserved)
* Write the modified content stream back into the http response
Replacements can also be defined as 'leading' or 'trailing', allowing a user to split strings.
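The scraping steps listed above can be illustrated with the following Python sketch, which uses the standard library `html.parser` module in place of the html parser referred to in the description. The set of attributes treated as language related, the `#text` pseudo-attribute and the dictionary used as the matches data structure are all assumptions; writing the result out to the XML files is omitted.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple

# Illustrative assumptions: attributes that carry language content,
# elements whose text is skipped, and elements that are never closed.
TEXT_ATTRS = {"alt", "title", "placeholder"}
SKIP_TAGS = {"script", "style"}
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

MatchKey = Tuple[str, str, str, str]  # (page, node tag, attribute tag, content)


class Scraper(HTMLParser):
    def __init__(self, page: str, matches: Dict[MatchKey, float],
                 now: float, refresh: float) -> None:
        super().__init__(convert_charrefs=True)
        self.page, self.matches = page, matches
        self.now, self.refresh = now, refresh
        self._open: List[str] = []  # stack of currently open elements

    def _record(self, tag: str, attr: str, content: str) -> None:
        key = (self.page, tag, attr, content)
        stamp = self.matches.get(key)
        if stamp is not None and self.now - stamp > self.refresh:
            del self.matches[key]          # expired: remove, then re-add below
        if key not in self.matches:
            self.matches[key] = self.now

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._open.append(tag)
        for name, value in attrs:
            if name in TEXT_ATTRS and value and value.strip():
                self._record(tag, name, value.strip())

    def handle_endtag(self, tag):
        if tag in self._open:
            while self._open.pop() != tag:
                pass  # also discard unclosed children of this element

    def handle_data(self, data):
        text = data.strip()
        if not text or any(tag in SKIP_TAGS for tag in self._open):
            return
        owner = self._open[-1] if self._open else "#document"
        self._record(owner, "#text", text)


def scrape(page: str, html_text: str, matches: Dict[MatchKey, float],
           now: float, refresh: float = 86400.0) -> Dict[MatchKey, float]:
    parser = Scraper(page, matches, now, refresh)
    parser.feed(html_text)
    parser.close()  # flush any text still buffered by the parser
    return matches
```

An event-driven parser is used here instead of building a full node tree; the element stack plays the role of the recursion mentioned in the description.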
Certain embodiments may utilise parameterised matches.
Regular Expressions (RegEx) are a key aspect of the algorithm used to identify which parts of the content are to be replaced. The RegEx is constructed to look not only for the content to be replaced but also for preceding and subsequent content (context) and the location of the content, in order to ensure replacement accuracy. Further, differing RegEx structures can result in significant performance differences. Therefore it is important that optimised RegEx expressions are used for each replacement to meet both accuracy and performance requirements. Given that this can be a complex process, RegEx expressions are built in the MC wherever possible and passed to the IE in the config files.
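As a hedged illustration of a context-aware RegEx, the Python sketch below builds a default pattern that matches a string only where it forms the text between two tags, so that identical text inside a tag or attribute is left untouched. The rule dictionary keys (`match`, `replacement`, `regex`) and the `lead`/`trail` group names are assumptions made for the example, not the format of the Replacements file.

```python
import html
import re
from typing import Dict, List


def default_pattern(match: str) -> str:
    """Default RegEx built around the match string.

    The lookbehind and lookahead supply the context: the text must sit
    directly between '>' and '<', optionally padded by whitespace.
    """
    return r"(?<=>)(?P<lead>\s*)" + re.escape(match) + r"(?P<trail>\s*)(?=<)"


def apply_replacements(content: str, rules: List[Dict[str, str]]) -> str:
    for rule in rules:
        # Use the optimised RegEx if one was supplied, else the default.
        pattern = rule.get("regex") or default_pattern(rule["match"])
        # Escape the replacement so that the HTML encoding stays correct.
        replacement = html.escape(rule["replacement"], quote=False)

        def substitute(found: "re.Match[str]", text: str = replacement) -> str:
            groups = found.groupdict()
            return (groups.get("lead") or "") + text + (groups.get("trail") or "")

        content = re.sub(pattern, substitute, content)
    return content
```

A supplied `regex` is assumed to match exactly the text to be replaced, with any context expressed through lookarounds, as in the default pattern.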
1.5 EndRequest 12
This is the last event in the http pipeline before the server 1 returns the response 5 to the client 2. The IE functionality at this stage is:
* If the http response contains a redirect, process this ensuring that the language context/selection is preserved
* Check that other http parameters are correctly formed, specifically around caching, content length (which will have been revised), etc.
The Management Component ('MC')
This system component does not form part of the real-time operation of the system.
Instead, it is used (typically) on a terminal (e.g. PC) which is separate to the web server and is used by a limited number of authorised administration users. It consumes and produces the XML files produced/used by the IE, provides the user interface to manage content, translations and configuration and can manage multiple IE implementations across an organisation. It incorporates and/or interacts with a database, such as a SQL Server.
The second system component, the Management Component (MC) 13, allows authorised users 14 (e.g. administrators) to perform various managerial tasks, and provides a control mechanism for configuring and adapting the behaviour of the IE. In an illustrative embodiment it comprises the following functions:
o Issues the license key to the IEs
o Imports the following XML files from the IE:
  * The Matches file produced during scraping
  * The Pages file produced during scraping
  * Event, trace and error log files
  * Activity logging files
o Exports the following XML files to the IE:
  * Configuration settings to control IE operation, including the license key
  * The Replacements file that contains matches, replacements and associated rules
  * The Pages file that defines how each page in the target application is to be processed
  * Template content. For complex replacements (e.g. the language selector), contains the template content and completion rules
o Organises imported (from the IE) matches to categorise them for further management, remove them as targets and promote/relegate common matches to global/local status;
o Local translation allows replacements (translations) to be entered and for the status of the translation to be tracked;
o External translation provides an export to an external translator and interfaces to machine translation (Bing & Google as standard), translation memory, translation workflow and other external systems;
o Language selection - selecting the set of languages to be supported for each application;
o Page configuration - for each page in the application, the MC defines how that page is to be handled (whether to translate, insert a selector, redirect to a different page, etc.);
o Scraping - provides a split user interface that can interactively control the scraping function of the IE whilst viewing the content stream, allowing the scope of identified matches to be controlled;
o Application configuration - provides a User Interface to manage all configuration settings for the MC and IE(s), globally and/or on a per-application basis.
The present invention provides significant cost savings in comparison to prior art approaches, including:
* significantly lower deployment costs relative to software renovation (localisation)
* lower operational costs than the multi-application approach
* costs for future UI content changes and additional languages become trivial
* the invention enables unification of applications across countries/regions
* the invention provides potential benefits beyond language capabilities: it enables internal development, and this simplification yields net savings
* the end user can switch languages easily, simply and quickly with the click of a button
* the organisation has only one version of the application to maintain, and concerns regarding different languages become less of a problem for the developers and organisation alike
* reduced effort for implementation and ongoing maintenance: the invention provides a 'low impact' solution.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be capable of designing many alternative embodiments without departing from the scope of the invention as defined by the appended claims. In the claims, any reference signs placed in parentheses shall not be construed as limiting the claims. The words "comprising" and "comprises", and the like, do not exclude the presence of elements or steps other than those listed in any claim or the specification as a whole. In the present specification, "comprises" means "includes or consists of" and "comprising" means "including or consisting of". The singular reference of an element does not exclude the plural reference of such elements and vice-versa. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware.
The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (16)

  1. A web-based multilingual system comprising at least one software component arranged to: determine whether a natural language switch is required during a HTTP session; and if so: identify at least one target item in a HTTP response sent from a web server to a client during the HTTP session; and modify the HTTP response to replace the at least one target item with a replacement item; wherein the target and replacement items are specified in a configuration file.
  2. A system according to claim 1 wherein the HTTP response is generated by software executing on the web server.
  3. A system according to claim 1 or 2 wherein at least one software component of the system is arranged such that it operates in the request/response pipeline which is established between the client and web server.
  4. A system according to any preceding claim wherein Request and Response objects are created by the server as a result of a HTTP request being received from the client, and modification of the HTTP response is performed by writing the replacement item to a stream associated with the Response object.
  5. A system according to any preceding claim wherein the HTTP response is modified to include a language selection in accordance with the configuration file.
  6. A system according to any preceding claim wherein the system is further arranged to operate in reverse proxy mode wherein a local URL entered at the client is mapped by the server to a remote site, such that the system serves as client to that site.
  7. A system according to any preceding claim wherein the system comprises at least one further software component arranged to: parse the content of a HTTP response to provide a set of nodes; process each type of node to extract any language-sensitive content; and export said language-sensitive content to an XML file.
  8. A system according to any preceding claim wherein the system further comprises a user interface component arranged to enable one or more authorised users to specify or select the one or more target items and corresponding replacements for a given language.
  9. A system according to any preceding claim wherein a trigger is used to determine whether a language switch is required, preferably wherein the trigger comprises information which is obtained or generated at the start of the session, or during the session.
  10. A method of modifying the natural language content of a HTTP response delivered from a web server to a client, the method comprising the steps: specifying at least one target and replacement item in a configuration file; determining whether a natural language switch is required during a HTTP session; and if so: identifying at least one target item in the HTTP response sent from the web server to the client during the HTTP session; and modifying the HTTP response to replace the at least one target item with a replacement item.
  11. A method according to claim 10 wherein the HTTP response is generated by software executing on the web server.
  12. A method according to claim 10 or 11 further comprising the steps: creating Request and Response objects as a result of a HTTP request being received from the client; and modifying the HTTP response by writing the replacement item to a stream associated with the Response object.
  13. A method according to claims 10 to 12 wherein the HTTP response is modified to include a language selection in accordance with the configuration file.
  14. A method according to claims 10 to 13 comprising the step of: parsing the content of a HTTP response to provide a set of nodes; processing each type of node to extract any language-sensitive content; and exporting said language-sensitive content to an XML file.
  15. A method according to claims 10 to 14 and comprising the step of: providing a user interface component arranged to enable one or more authorised users to specify or select the one or more target items and corresponding replacements for a given language.
  16. A method according to claims 10 to 15 wherein a trigger is used to determine whether a language switch is required, preferably wherein the trigger comprises information which is obtained or generated at the start of the session, or during the session.
GB1405190.8A 2014-03-24 2014-03-24 Multilingual system and corresponding method Withdrawn GB2524491A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1405190.8A GB2524491A (en) 2014-03-24 2014-03-24 Multilingual system and corresponding method

Publications (2)

Publication Number Publication Date
GB201405190D0 GB201405190D0 (en) 2014-05-07
GB2524491A true GB2524491A (en) 2015-09-30

Family

ID=50686762

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1405190.8A Withdrawn GB2524491A (en) 2014-03-24 2014-03-24 Multilingual system and corresponding method

Country Status (1)

Country Link
GB (1) GB2524491A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112306620A (en) * 2020-12-24 2021-02-02 深圳市蓝凌软件股份有限公司 Multi-language loading method and device for user-defined form control

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113962213A (en) * 2021-10-27 2022-01-21 深圳康佳电子科技有限公司 Multi-turn dialog generation method, terminal and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6415249B1 (en) * 2000-03-01 2002-07-02 International Business Machines Corporation Method and system for using machine translation with content language specification
WO2005033827A2 (en) * 2003-10-02 2005-04-14 Netmask (El-Mar) Internet Technologies Ltd. Configuration setting
US20060218133A1 (en) * 2005-03-24 2006-09-28 Atkin Steven E Constructing dynamic multilingual pages in a Web portal
WO2006132676A1 (en) * 2005-06-02 2006-12-14 Oracle International Corporation Globalization framework for providing locale-specific services using client-side scripting languages
US20070043818A1 (en) * 2000-02-09 2007-02-22 Microsoft Corporation Creation and delivery of customized content
US20120023160A1 (en) * 2003-10-02 2012-01-26 Netmask (El-Mar) Internet Technologies Ltd. Dynamic content conversion

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)