CN109992796B - Mercube machine translation management control system and method and computer program - Google Patents

Mercube machine translation management control system and method and computer program

Info

Publication number
CN109992796B
Authority
CN
China
Prior art keywords
translation
management
request
management module
mercube
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910131256.9A
Other languages
Chinese (zh)
Other versions
CN109992796A (en)
Inventor
米艳杰
陶为民
程国艮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Glabal Tone Communication Technology Co ltd
Original Assignee
Glabal Tone Communication Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Glabal Tone Communication Technology Co ltd
Priority to CN201910131256.9A
Publication of CN109992796A
Application granted
Publication of CN109992796B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention belongs to the technical field of natural language processing or conversion, and discloses a Mercube machine translation management control system and method and a computer program, comprising: an application layer and a platform layer. The application layer comprises an integrated management module, an upgrade management module and a monitoring management module; the platform layer comprises a device management module, a container management module, a machine translation interface management module, a machine translation capability management module and a copyright management module. MerCube provides higher bandwidth and more links, which improves the scalability of multi-GPU and multi-GPU/CPU system configurations. The invention is installed and configured by automated scripts. The system truly realizes one-key installation, one-key upgrade and download, one-key start/stop of services and one-key tailoring, achieving easy deployment and convenient remote maintenance. The back-end services are deployed with the popular Docker container technology, so the system runs stably and is simple to manage.

Description

Mercube machine translation management control system and method and computer program
Technical Field
The invention belongs to the technical field of natural language processing or conversion, and particularly relates to a Mercube machine translation management control system and method and a computer program.
Background
Currently, the state of the art commonly used in the industry is as follows: machine translation is the process of learning the correspondence between two languages from bilingual parallel data using machine learning algorithms, and then converting one natural language into another using the learned rules. The development of machine translation technology has closely accompanied the development of disciplines such as computer technology, information theory and linguistics. Since the 1930s, machine translation has gone through a pioneering period, a period of setbacks, a recovery period and a new period. The new period began in 1990 and has seen rule-based machine translation, statistics-based machine translation and neural-network-based machine translation. At present, neural-network-based machine translation is dominant; compared with statistical machine translation, it adopts an end-to-end translation mode. Its main idea is an encoder-decoder structure, in which the encoding and decoding modules are each trained and used for translation with recurrent neural networks. For a sentence to be translated, the encoder first converts the sentence into a vector of fixed dimension; taking this vector as input, the decoder then produces a sequence of word vectors, and finally the output sequence of word vectors is converted into target-language words by dictionary lookup.
In existing mainstream machine translation, whether the system is based on statistics or on neural networks, a single GPU can only be loaded with one natural language model, and the structure and function are relatively simple.
In summary, the problem of the prior art is: in existing mainstream machine translation, whether the system is based on statistics or on neural networks, a single GPU can only be loaded with one natural language model, and the structure and function are relatively simple.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a Mercube machine translation management control system and method and a computer program.
The invention is realized in such a way that a Mercube machine translation management control system comprises: an application layer and a platform layer;
the application layer comprises: the system comprises an integrated management module, an upgrade management module and a monitoring management module;
the platform layer comprises: a device management module, a container management module, a machine translation interface management module, a machine translation capability management module and a copyright management module.
Further, the integrated management module includes:
The system comprises a document processing integrated unit, a LangBox integrated unit and a user third party system integrated unit;
Document processing integration unit: used for responding to a call request from the document processing system via HTTP Post/Get and returning the translation result;
LangBox integration unit: used for responding to a call request from the LangBox via HTTP Post/Get and returning the translation result;
User third-party system integration unit: used for responding to a call request from a user's third-party system via HTTP Post/Get or with a docx/excel file as the medium, and returning the translation result; an illustrative call sketch follows this list.
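A minimal sketch of a third-party integration call, assuming a JSON-over-HTTP translation endpoint; the URL path, port and field names beyond srcl/tgtl/text/APP_ID are illustrative assumptions and are not specified by the patent text.
import requests

def request_translation(host, sentences, srcl="en", tgtl="zh"):
    # Build a request packet with the fields named in the merged-packet format below.
    payload = {
        "srcl": srcl,                       # source language
        "tgtl": tgtl,                       # target language
        "text": [{"text": s} for s in sentences],
    }
    # Hypothetical endpoint path; the real MerCube route is not disclosed here.
    resp = requests.post("http://%s/uwsgi/translate" % host, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()                      # expected to contain the translation result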
Further, the upgrade management module includes:
The module comprises an upload management unit, a download management unit and a script management unit;
Upload management unit: used for uploading the files, data models, configuration files, programs and scripts needed for an upgrade; each upload must contain a script file; after the upload finishes, MerCube automatically executes the script to complete the upgrade of the back-end system;
Download management unit: used for downloading logs and any file on the MerCube system;
Script management unit: used for uniformly managing all uploaded and downloaded scripts and regularly cleaning up expired scripts.
Further, the monitoring management module includes:
the system management unit, the call cost statistics unit, the caller management unit, the call history statistics unit, the caller authority management unit, the equipment use monitoring unit and the caller pricing management unit;
System management unit: used for managing the administrators of the MerCube system, including modifying, saving and setting system parameters such as names and passwords;
Call fee statistics unit: used for making fee statistics on the translation requests issued by a calling user over a period of time and summarizing them in table form; language pairs, translated word counts and requested word counts can be counted by hour, day and month;
Caller management unit: used for adding, deleting, modifying and querying calling users;
Call history statistics unit: used for tracking the history of translation requests issued by a calling user over a period of time and displaying it in various forms such as line charts and bar2d charts;
Caller permission management unit: used for controlling whether the calling user may issue translation requests, including the controlled time period and the controlled language-pair direction;
Device usage monitoring unit: used for detecting the usage of the GPU cards in real time, including: the busy/idle occupancy rate of each GPU over the past 1/4/8/24 hours; the number of requests per language pair; the average (translation) request completion time, etc.; displayed in various forms such as line charts and bar2d charts;
Caller pricing management unit: used for billing control of the calling user's translation requests and for setting unit prices (different billing unit prices may later be set for different language pairs).
Further, the device management module includes:
the GPU equipment distribution management unit and the GPU development environment management unit;
GPU device allocation management unit: used for statically allocating which language model each GPU card runs; or, while the system is running normally, temporarily taking a GPU card offline and reallocating it;
GPU development environment management unit: used for upgrade and test management of the GPU cards' development environments, driver modules, etc.
Further, the container management module includes:
a machine translation scheduling container unit and a machine translation engine container unit;
Machine translation scheduling container unit: used for one-key start/stop and management of the scheduling container, inside which Nginx (reverse proxy and static Web server), HAProxy (high-availability load-balancing proxy), Redis (in-memory database for accelerated access), MySQL (database) and uWSGI (Web server supporting the Python Flask framework) are installed;
Machine translation engine container unit: used for one-key start/stop and management of the machine translation engine container; the container contains the machine translation core program, the machine translation modules, and the deep-learning-trained language models and configuration files; a start/stop sketch follows this list.
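A minimal sketch of "one-key" start/stop for the scheduling and engine containers, assuming they are ordinary Docker containers driven through the standard docker CLI; the container names are hypothetical.
import subprocess

CONTAINERS = ["mt_dispatch", "mt_engine"]     # assumed names for the two containers

def one_key(action):
    # `docker start/stop/restart <name>` are standard Docker CLI commands.
    assert action in ("start", "stop", "restart")
    for name in CONTAINERS:
        subprocess.run(["docker", action, name], check=True)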
Further, the machine translation interface management module comprises:
a call permission management unit, a call queue hierarchy management unit, an optimization policy management unit and a flow control management unit;
Call permission management unit: used for setting the permissions of calling users, allowing or forbidding a user to use the machine translation system during a certain period; it can also restrict which language pairs may be used and, combined with the rate-limiting measures, dynamically regulate the user's usage;
Call queue hierarchy management unit: used for Priority level management of calling users; high-priority users are allocated resources first and obtain responses first; low-priority users are granted translation resources only when the system is idle;
Optimization policy management unit: used for unified policy management of the optimization measures and for combining the optimization measures;
Flow control management unit: when the translation load is heavy and users issue large numbers of requests, rate-limiting measures are started to protect the rights of all users, and users issuing excessive requests are restricted so as to keep system calls balanced.
Further, the optimization measures include:
(1) Combining the scheduling of machine translation resources with the device management of the GPU graphics cards, so that both static allocation (a card is bound to a fixed language model) and dynamic allocation (a card is not bound to a fixed language model) can be realized;
(2) Caching translation results by exploiting the fast access of the Redis in-memory database, so they can be reused directly the next time;
(3) Merging a user's many small requests and submitting them to the back-end translation in one batch, which returns results faster than submitting single sentences.
Further, the rate-limiting control includes:
the rate-limiting modes are divided into static control, dynamic control and hybrid control;
Static control: for a given user, fixed flow-control threshold parameters (thresholds or criteria) are set in advance; for example: fewer than 10 requests within 1 second; fewer than 100 requests within 1 minute; no more than one million translated words within 1 hour; when the user reaches a threshold, the user's usage is limited for a period of time;
Dynamic control: the back-end response time is measured on every call of the translation interface, so as to judge whether the back-end translation tasks are busy or idle; for example: the response time of a call must be below a certain value (e.g. within 20 ms); beyond that value, the user is temporarily rate-limited;
Hybrid control: supports dynamic and static control simultaneously; the dynamic check is applied first, and after the dynamic check passes, the static check is applied.
Further, the use of the rate-limiting control includes:
the flow-control threshold parameters can be combined via the optimization policy management unit; meanwhile, the rate-limiting measures can be used together with user permission management to good effect; a rate-limiter sketch follows this list.
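A minimal sketch of the static flow-control check described above, assuming in-memory per-user counters; the thresholds mirror the examples in the text (fewer than 10 requests per second, fewer than 100 per minute), and the class name is illustrative.
import time
from collections import defaultdict, deque

class StaticLimiter:
    def __init__(self, per_second=10, per_minute=100):
        self.per_second, self.per_minute = per_second, per_minute
        self.history = defaultdict(deque)        # user -> timestamps of recent requests

    def allow(self, user):
        now = time.time()
        q = self.history[user]
        while q and now - q[0] > 60:             # drop entries older than one minute
            q.popleft()
        last_second = sum(1 for t in q if now - t <= 1)
        if last_second >= self.per_second or len(q) >= self.per_minute:
            return False                         # threshold reached: throttle the user
        q.append(now)
        return True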
Further, the machine translation capability management module includes:
Language pair management unit: the machine translation capability carried by the MerCube system can be tailored and customized, determining which language pairs are exposed for use by users and which are temporarily disabled or hidden;
Translation capability management unit: based on GPU device management, used for manually adjusting the language models loaded on the GPU cards installed in the MerCube system, so as to meet the business need for a larger translation volume in a certain language direction (for example, en->zh) during a certain period.
Further, the copyright management module: adopting various measures to protect intellectual property of the system;
(1) The system collects key characteristics of the Mercube machine to ensure that the engine can only run on fixed hardware and is adapted to a fixed GPU card;
(2) The system embeds the codes of the core translation algorithm and the key feature acquisition algorithm into the dongle, and dynamically loads the codes when the system runs, so that the system is prevented from being copied and cracked.
Further, the MerCube machine translation management control method comprises the following steps:
step one, merging a plurality of data packets according to a merging request packet strategy;
step two, designing a routing rule and scheduling the data packet;
Step three: process the data containing text content, split it into sentences and place them in srcl_list, and set up tgtl_list to hold the results of the corresponding sentences found in redis;
Step four: loop over srcl_list, take each value, compute its MD5 and look it up in redis, and store the queried data in tgtl_list;
Step five: combine the sentences not found in the cache into one batch and pass it on to NMT for translation;
Step six: traverse the translated sentences, compute their MD5 and store them in redis, then traverse and fill them into tgtl_list to form the complete translation list;
Step seven: join the translation list into a string and return it; a sketch of steps three to seven follows this list.
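A minimal sketch of steps three to seven, assuming a local Redis cache and a caller-supplied NMT function; split_sentences() and nmt_translate() are placeholders for the real sentence splitter and translation engine, which are not specified here.
import hashlib
import redis

r = redis.Redis()                                   # assumed local Redis instance

def md5_key(sentence):
    return hashlib.md5(sentence.encode("utf-8")).hexdigest()

def translate_text(text, split_sentences, nmt_translate):
    srcl_list = split_sentences(text)               # step three: split into sentences
    tgtl_list = [None] * len(srcl_list)
    misses = []
    for i, sent in enumerate(srcl_list):            # step four: look each sentence up in Redis
        cached = r.get(md5_key(sent))
        if cached is not None:
            tgtl_list[i] = cached.decode("utf-8")
        else:
            misses.append(i)
    if misses:                                      # step five: translate the cache misses in one batch
        translated = nmt_translate([srcl_list[i] for i in misses])
        for i, out in zip(misses, translated):      # step six: write back to Redis and fill the list
            r.set(md5_key(srcl_list[i]), out)
            tgtl_list[i] = out
    return "".join(tgtl_list)                       # step seven: join into the returned string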
Further, in the first step, the merging request packet policy includes:
{
  "srcl": "nen",
  "tgtl": "nzh",
  "text": [ { "APP_ID": "addsadsd", "addr": 11111, "text": "content" }, { }, { } ]
}
Further, in the second step, the routing criteria include:
The request priority level (Priority) is defined dynamically, with no fixed number of levels; the specific number of levels is determined by the party issuing the translation requests; request levels are represented by a singly linked list and requests by a queue; level N is the highest and level 0 the lowest;
(3) Routing in rule:
when a request call arrives: firstly, quickly searching a language pair (LangPair) linked list, searching a Priority linked list at the same level after finding, and queuing new requests at the tail of all queuing calls in the linked list; if the language pair is not found, creating one, and inserting the one at the tail of the language pair linked list; if the priority level is not found, creating one, arranging in a descending order, and inserting the one into a proper position in a priority level queue; after the calls enter the queuing queue, counting Content-Length values of all the calls under the priority level, accumulating the Content-Length values, and storing the accumulated Content-Length values in a priority level node;
(4) Routing out rules:
1) Language pairs: no priority order; the priority linked lists under each language pair are checked one after another in sequence;
2) Priority level: prioritized; firstly, outputting an N-level linked list, after finishing outputting, outputting N-1 level, and so on until the last 0 level; under the same-level linked list, firstly, call_0 is output, then call_1 is output, and so on until call_n is finally output;
3) When a request call arrives: any request call arrives, the priority level of the language where the call is located to the node is checked, the stored Content-Length value is obtained from the highest N level, and whether the Content-Length value reaches the set value is judged: if so, starting to combine all call request packets to form a big packet and sending the big packet; if the set size is not reached, checking the N-1 level downwards in sequence, and continuously accumulating the Content-Length value of the queuing queue below the N-1 level until the last level 0; after the combined packets are sent out, the combined calls are emptied from the queuing queue at the same time;
4) The timing duration (timeout duration for waiting for the call) has arrived: sequentially checking all language pairs from the head to the head, and checking the processing procedure, which is the same as the processing procedure of the above [ when a request call arrives ];
5) At the end of the check, if the accumulated Content-Length value is smaller than the set value, the remaining queued packets are merged and sent out; at the same time, the queues under all language pairs/priority levels are emptied; a routing sketch follows this list.
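A minimal sketch of the routing-in bookkeeping and priority-ordered routing-out described above, assuming in-memory dictionaries in place of the linked lists of the patent figures; the Call object, its content_length attribute and the MAX_PACKET threshold are illustrative assumptions.
from collections import deque

MAX_PACKET = 2000                 # assumed Content-Length threshold for a merged packet

class Router:
    def __init__(self):
        # lang_pair -> {priority -> deque of calls}; priorities are checked high to low
        self.table = {}
        self.acc_len = {}         # (lang_pair, priority) -> accumulated Content-Length

    def route_in(self, lang_pair, priority, call):
        levels = self.table.setdefault(lang_pair, {})
        levels.setdefault(priority, deque()).append(call)            # queue at the tail
        key = (lang_pair, priority)
        self.acc_len[key] = self.acc_len.get(key, 0) + call.content_length

    def route_out(self, lang_pair):
        # Merge queued calls, highest priority (level N) first, until MAX_PACKET is reached.
        merged, total = [], 0
        levels = self.table.get(lang_pair, {})
        for priority in sorted(levels, reverse=True):                 # level N first, level 0 last
            q = levels[priority]
            while q and total < MAX_PACKET:
                call = q.popleft()                                    # call_0 before call_1, ...
                merged.append(call)
                total += call.content_length
                self.acc_len[(lang_pair, priority)] -= call.content_length
            if total >= MAX_PACKET:
                break
        return merged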
Further, in the second step, the scheduling management includes:
(1) Receive request packets and store them into different priority queues according to the priority parameter; scan at 5 ms intervals; packets whose length is >= 2000 are sent directly to the back end for processing without merging;
(2) Triggering the distribution mechanism: request packets are merged starting from the highest-level queue and from the head of each queue, until the set packet length is reached;
(3) Merging rule: when the scan interval expires, only requests in the same language-pair direction are merged, and they are submitted to the back end for processing whether or not the packet length meets the requirement; a dispatch-loop sketch follows this list.
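A minimal sketch of the dispatch trigger, assuming the Router sketch above and a caller-supplied send_to_backend() callable; the 5 ms scan interval and the per-language-pair merge come from the scheduling rules just described.
import time

def dispatch_loop(router, lang_pairs, send_to_backend, interval=0.005):
    while True:
        for lp in lang_pairs:
            batch = router.route_out(lp)        # only same-language-pair requests are merged
            if batch:
                send_to_backend(lp, batch)      # submit whether or not the length target was met
        time.sleep(interval)                    # scan every 5 ms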
Another object of the present invention is to provide a MerCube machine translation management control method for operating the MerCube machine translation management control system.
another object of the present invention is to provide a computer program for implementing the MerCube machine translation management control method.
Another object of the present invention is to provide an information data processing terminal implementing the MerCube machine translation management control method.
It is another object of the present invention to provide a computer readable storage medium including instructions which, when executed on a computer, cause the computer to perform the MerCube machine translation management control method.
In summary, the advantages and positive effects of the invention are: the MerCube product incorporates a number of industry-leading machine translation technologies, including neural network machine translation, statistical machine translation, technical-term translation and translation memory technology. The neural network machine translation technology mainly adopts an attention-based machine translation model framework, a novel attention-based encoder-decoder translation structure; the data pre-processing and post-processing technologies meet industrial application standards, and translation accuracy is greatly improved while processing speed is guaranteed.
MerCube adopts the NVIDIA NVLink technology, which provides higher bandwidth and more links and improves the scalability of multi-GPU and multi-GPU/CPU system configurations. Using this technology, the performance of the neural network translation system is improved, and ultimately translation at the chapter level can be accelerated.
The invention is installed and configured by automated scripts. The system truly realizes one-key installation, one-key upgrade and download, one-key start/stop of services and one-key tailoring, achieving easy deployment and convenient remote maintenance. The back-end services are deployed with the popular Docker container technology, so the system runs stably and is simple to manage.
To make optimal use of translation resources, the MerCube system uses a series of techniques and several functional modules for scientific scheduling and system tuning:
user requests are queued at multiple levels according to priority, and high-priority requests are answered first;
frequent user requests are flow-controlled according to resource allocation, and large numbers of unreasonable instantaneous requests from some users are limited, preventing an imbalance between resources and calls.
The system also provides complete user management and monitoring measures: user permissions, usage times and usable language pairs can be set, the user's current usage can be displayed in real time, and historical usage is accurately counted.
A universal JSON interface makes it easy to integrate the MerCube system with other third-party systems. For example, it interfaces seamlessly with the document translation system, sharing back-end modules and pooling resources.
The technical framework of the MerCube system adopts a development technology and deployment scheme with separated front and back ends, so that views, data and structure are separated and data flows, service flows and page flows can be operated, maintained and modified independently, achieving modular development.
The integrated core translation engine, together with functional modules such as intelligent management (user management, device management and translation capability management), dynamic monitoring (real-time monitoring of running state and capacity) and optional billing, makes the machine translation system a truly independent system that can be easily released and conveniently managed.
Based on GPU device management, MerCube allows a system administrator to dynamically adjust the loaded language models, so that translation capability can be managed dynamically according to business needs.
And adopting various encryption measures to protect intellectual property rights. The system provides soft encryption (a core algorithm encrypts and stores a hard disk, and dynamically loads memory for decryption when executing), and hard encryption (the core algorithm is embedded in a dongle) is used for protecting the system from copying and cracking.
Drawings
FIG. 1 is a schematic diagram of a Mercube machine translation management control system according to an embodiment of the present invention;
Fig. 2 is a flowchart of a MerCube machine translation management control method provided by an embodiment of the present invention.
Fig. 3 is a flow chart of a routing rule provided by an embodiment of the present invention.
Fig. 4 is a routing rule flow chart provided by an embodiment of the present invention.
Fig. 5 is a schematic diagram of a routing rule design according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of dynamically changing GPU language pairs according to an embodiment of the present invention.
Fig. 7 is a functional architecture diagram of a MerCube machine translation management control system provided by an embodiment of the present invention.
Fig. 8 is a technical architecture diagram of a MerCube machine translation management control system provided by an embodiment of the present invention.
In the figure: 1. an integrated management module; 2. an upgrade management module; 3. a monitoring management module; 4. a device management module; 5. a container management module; 6. a machine translation interface management module; 7. a machine translation capability management module; 8. a copyright management module.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The MerCube system provides a way to load two natural language models on one GPU card and realizes the function of dynamically changing the language models. Moreover, the biggest problem of cloud translation is data security: when a user uses a cloud translation product, the cloud translation back end inevitably obtains the plaintext of the text to be translated, which is not conducive to protecting the user's privacy. The MerCube system avoids such problems: MerCube is a dedicated machine translation server for the user and is deployed on-premises, so clients can only access it within the local area network, which guarantees the security of the user's original translation data from the physical environment up.
The principle of application of the invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the MerCube machine translation management control system provided by the embodiment of the present invention includes: an application layer and a platform layer.
The application layer comprises: the system comprises an integrated management module 1, an upgrade management module 2 and a monitoring management module 3.
The platform layer comprises: a device management module 4, a container management module 5, a machine translation interface management module 6, a machine translation capability management module 7 and a copyright management module 8.
The integrated management module 1 provided by the embodiment of the invention comprises:
the system comprises a document processing integrated unit, a LangBox integrated unit and a user third party system integrated unit;
Document processing integration unit: used for responding to a call request from the document processing system via HTTP Post/Get and returning the translation result;
LangBox integration unit: used for responding to a call request from the LangBox via HTTP Post/Get and returning the translation result;
User third-party system integration unit: used for responding to a call request from a user's third-party system via HTTP Post/Get or with a docx/excel file as the medium, and returning the translation result.
The upgrade management module 2 provided by the embodiment of the invention comprises:
The module comprises an upload management unit, a download management unit and a script management unit;
Upload management unit: used for uploading the files, data models, configuration files, programs and scripts needed for an upgrade; each upload must contain a script file; after the upload finishes, MerCube automatically executes the script to complete the upgrade of the back-end system;
Download management unit: used for downloading logs and any file on the MerCube system;
Script management unit: used for uniformly managing all uploaded and downloaded scripts and regularly cleaning up expired scripts.
The monitoring management module 3 provided by the embodiment of the invention comprises:
the system management unit, the call cost statistics unit, the caller management unit, the call history statistics unit, the caller authority management unit, the equipment use monitoring unit and the caller pricing management unit;
System management unit: used for managing the administrators of the MerCube system, including modifying, saving and setting system parameters such as names and passwords;
Call fee statistics unit: used for making fee statistics on the translation requests issued by a calling user over a period of time and summarizing them in table form; language pairs, translated word counts and requested word counts can be counted by hour, day and month;
Caller management unit: used for adding, deleting, modifying and querying calling users;
Call history statistics unit: used for tracking the history of translation requests issued by a calling user over a period of time and displaying it in various forms such as line charts and bar2d charts;
Caller permission management unit: used for controlling whether the calling user may issue translation requests, including the controlled time period and the controlled language-pair direction;
Device usage monitoring unit: used for detecting the usage of the GPU cards in real time, including: the busy/idle occupancy rate of each GPU over the past 1/4/8/24 hours; the number of requests per language pair; the average (translation) request completion time, etc.; displayed in various forms such as line charts and bar2d charts;
Caller pricing management unit: used for billing control of the calling user's translation requests and for setting unit prices (different billing unit prices may later be set for different language pairs).
The device management module 4 provided by the embodiment of the invention comprises:
the GPU equipment distribution management unit and the GPU development environment management unit;
GPU device allocation management unit: used for statically allocating which language model each GPU card runs; or, while the system is running normally, temporarily taking a GPU card offline and reallocating it;
GPU development environment management unit: used for upgrade and test management of the GPU cards' development environments, driver modules, etc.
The container management module 5 provided by the embodiment of the invention comprises:
a machine translation scheduling container unit and a machine translation engine container unit;
Machine translation scheduling container unit: used for one-key start/stop and management of the scheduling container, inside which Nginx (reverse proxy and static Web server), HAProxy (high-availability load-balancing proxy), Redis (in-memory database for accelerated access), MySQL (database) and uWSGI (Web server supporting the Python Flask framework) are installed;
Machine translation engine container unit: used for one-key start/stop and management of the machine translation engine container; the container contains the machine translation core program, the machine translation modules, and the deep-learning-trained language models and configuration files.
The machine translation interface management module 6 provided by the embodiment of the invention comprises:
a call permission management unit, a call queue hierarchy management unit, an optimization policy management unit and a flow control management unit;
Call permission management unit: used for setting the permissions of calling users, allowing or forbidding a user to use the machine translation system during a certain period; it can also restrict which language pairs may be used and, combined with the rate-limiting measures, dynamically regulate the user's usage;
Call queue hierarchy management unit: used for Priority level management of calling users; high-priority users are allocated resources first and obtain responses first; low-priority users are granted translation resources only when the system is idle;
Optimization policy management unit: used for unified policy management of the optimization measures and for combining the optimization measures;
Flow control management unit: when the translation load is heavy and users issue large numbers of requests, rate-limiting measures are started to protect the rights of all users, and users issuing excessive requests are restricted so as to keep system calls balanced.
The optimization measures provided by the embodiment of the invention comprise:
(1) Combining the scheduling of machine translation resources with the device management of the GPU graphics cards, so that both static allocation (a card is bound to a fixed language model) and dynamic allocation (a card is not bound to a fixed language model) can be realized;
(2) Caching translation results by exploiting the fast access of the Redis in-memory database, so they can be reused directly the next time;
(3) Merging a user's many small requests and submitting them to the back-end translation in one batch, which returns results faster than submitting single sentences.
The rate-limiting control provided by the embodiment of the invention includes the following:
the rate-limiting modes are divided into static control, dynamic control and hybrid control;
Static control: for a given user, fixed flow-control threshold parameters (thresholds or criteria) are set in advance; for example: fewer than 10 requests within 1 second; fewer than 100 requests within 1 minute; no more than one million translated words within 1 hour; when the user reaches a threshold, the user's usage is limited for a period of time;
Dynamic control: the back-end response time is measured on every call of the translation interface, so as to judge whether the back-end translation tasks are busy or idle; for example: the response time of a call must be below a certain value (e.g. within 20 ms); beyond that value, the user is temporarily rate-limited;
Hybrid control: supports dynamic and static control simultaneously; the dynamic check is applied first, and after the dynamic check passes, the static check is applied.
The use of the rate-limiting control provided by the embodiment of the invention includes:
the flow-control threshold parameters can be combined via the optimization policy management unit; meanwhile, the rate-limiting measures can be used together with user permission management to good effect.
The machine translation capability management module 7 provided by the embodiment of the invention comprises:
Language pair management unit: the machine translation capability carried by the MerCube system can be tailored and customized, determining which language pairs are exposed for use by users and which are temporarily disabled or hidden;
Translation capability management unit: based on GPU device management, used for manually adjusting the language models loaded on the GPU cards installed in the MerCube system, so as to meet the business need for a larger translation volume in a certain language direction (for example, en->zh) during a certain period.
The copyright management module 8 provided by the embodiment of the invention: adopting various measures to protect intellectual property of the system;
(1) The system collects key characteristics of the Mercube machine to ensure that the engine can only run on fixed hardware and is adapted to a fixed GPU card;
(2) The system embeds the codes of the core translation algorithm and the key feature acquisition algorithm into the dongle, and dynamically loads the codes when the system runs, so that the system is prevented from being copied and cracked.
As shown in fig. 2, the MerCube machine translation management control method provided by the embodiment of the invention includes the following steps:
s101, merging a plurality of data packets according to a merging request packet strategy;
s102, designing a routing rule and scheduling the data packet;
S103, process the data containing text content, split it into sentences and place them in srcl_list, and set up tgtl_list to hold the results of the corresponding sentences found in redis;
S104, loop over srcl_list, take each value, compute its MD5 and look it up in redis, and store the queried data in tgtl_list;
S105, combine the sentences not found in the cache into one batch and pass it on to NMT for translation;
S106, traverse the translated sentences, compute their MD5 and store them in redis, then traverse and fill them into tgtl_list to form the complete translation list;
S107, join the translation list into a string and return it.
In step S101, the merging request packet policy provided in the embodiment of the present invention includes:
{
  "srcl": "nen",
  "tgtl": "nzh",
  "text": [ { "APP_ID": "addsadsd", "addr": 11111, "text": "content" }, { }, { } ]
}
As shown in fig. 3 to 5, in step S102, the routing criteria provided in the embodiment of the present invention include:
The request priority level (Priority) is defined dynamically, with no fixed number of levels; the specific number of levels is determined by the party issuing the translation requests; request levels are represented by a singly linked list and requests by a queue; level N is the highest and level 0 the lowest;
(5) Routing in rule:
when a request call arrives: firstly, quickly searching a language pair (LangPair) linked list, searching a Priority linked list at the same level after finding, and queuing new requests at the tail of all queuing calls in the linked list; if the language pair is not found, creating one, and inserting the one at the tail of the language pair linked list; if the priority level is not found, creating one, arranging in a descending order, and inserting the one into a proper position in a priority level queue; after the calls enter the queuing queue, counting Content-Length values of all the calls under the priority level, accumulating the Content-Length values, and storing the accumulated Content-Length values in a priority level node;
(6) Routing out rules:
1) Language pairs: no priority order; the priority linked lists under each language pair are checked one after another in sequence;
2) Priority level: prioritized; firstly, outputting an N-level linked list, after finishing outputting, outputting N-1 level, and so on until the last 0 level; under the same-level linked list, firstly, call_0 is output, then call_1 is output, and so on until call_n is finally output;
3) When a request call arrives: any request call arrives, the priority level of the language where the call is located to the node is checked, the stored Content-Length value is obtained from the highest N level, and whether the Content-Length value reaches the set value is judged: if so, starting to combine all call request packets to form a big packet and sending the big packet; if the set size is not reached, checking the N-1 level downwards in sequence, and continuously accumulating the Content-Length value of the queuing queue below the N-1 level until the last level 0; after the combined packets are sent out, the combined calls are emptied from the queuing queue at the same time;
4) The timing duration (timeout duration for waiting for the call) has arrived: sequentially checking all language pairs from the head to the head, and checking the processing procedure, which is the same as the processing procedure of the above [ when a request call arrives ];
5) At the end of the check, if the accumulated Content-Length value is smaller than the set value, the remaining queuing packets are merged and sent out; at the same time, queuing queues under all language pairs/priority levels are emptied.
In step S102, the scheduling management provided in the embodiment of the present invention includes:
(1) Receive request packets and store them into different priority queues according to the priority parameter; scan at 5 ms intervals; packets whose length is >= 2000 are sent directly to the back end for processing without merging;
(2) Triggering the distribution mechanism: request packets are merged starting from the highest-level queue and from the head of each queue, until the set packet length is reached;
(3) Merging rule: when the scan interval expires, only requests in the same language-pair direction are merged, and they are submitted to the back end for processing whether or not the packet length meets the requirement.
The principle of application of the present invention will be described in detail with reference to specific embodiments.
1. Merging a plurality of request packets:
1. merging request packet policies
The merged packet json format is as follows:
{
  "srcl": "nen",
  "tgtl": "nzh",
  "text": [ { "APP_ID": "addsadsd", "addr": 11111, "text": "content" }, { }, { } ]
}
2. The returned merged packet json format is as follows:
{
  "text": [ { "addr": 11111, "text": "Contents" }, { }, { } ],
  "use_time": "0.1234"
}
2. Routing criteria
The design of the priority of the translation request is based on the following routing criteria:
1. request grading and sequencing
(1) All translation requests are uniformly ordered according to priority, increasing upward from level 0, with level 0 the lowest;
(2) Requests of the same priority are queued first and then queued later in order of arrival time.
3. Design of routing criteria
For the above routing criteria, the following is designed:
1. The request priority level (Priority) is defined dynamically, with no fixed number of levels; the specific number of levels is determined by the party issuing the translation requests.
2. Request levels are represented by a "singly linked list" and requests by a "queue", as shown in the following figures.
Wherein, the N level is highest, and the 0 level is lowest.
Routing in rule:
when a request call arrives:
firstly, quickly searching a language pair (LangPair) linked list, searching a Priority linked list at the same level after finding, and queuing new requests at the tail of all queuing calls in the linked list;
if the language pair is not found, creating one, and inserting the one at the tail of the language pair linked list;
if the priority level is not found, creating one, arranging in a descending order, and inserting the one into a proper position in a priority level queue;
After the calls enter the queuing queue, the Content-Length values of all the calls under the priority level need to be counted, accumulated and then stored in the priority level node.
Routing out rules:
language pair:
there is no prioritization.
The priority linked lists under each language pair are checked one after another in sequence;
priority level:
with a priority order.
Firstly, outputting an N-level linked list, after finishing outputting, outputting N-1 level, and so on until the last 0 level;
under the same-level linked list, call_0 is output first, call_1 is output again, and so on until call_n is finally output
When a request call arrives:
any request call arrives, the priority level of the language where the call is located to the node is checked, the stored Content-Length value is obtained from the highest N level, and whether the Content-Length value reaches the set value is judged:
if so, all call request packets are combined to form a big packet and sent out.
If the set size is not reached, then the N-1 stages are checked sequentially down and the Content-Length values of their lower queuing queues continue to be accumulated, so on, until the last 0 stages.
After the merged packets are sent out, the merged calls need to be emptied from the queuing queue at the same time.
The timing duration (timeout duration for waiting for the call) has arrived:
All language pairs are checked in turn from the head of the list; the checking procedure is the same as in [when a request call arrives] above.
Note that at the end of the check, if the size of the accumulated Content-Length value is less than the set size, the remaining queued packets are merged and issued. At the same time, queuing queues under all language pairs/priority levels are emptied.
4. Scheduling management:
1. Receive request packets and store them into different priority queues according to the priority parameter; scan at 5 ms intervals; packets whose length is >= 2000 are sent directly to the back end for processing without merging.
2. Triggering the distribution mechanism: request packets are merged starting from the highest-level queue and from the head of each queue, until the set packet length is reached.
3. Combining rules: when the scanning time interval is up, only the requests of the same language to the direction are merged, and the requests are submitted to the back end for processing no matter whether the packet length meets the requirement or not.
5. GPU device detection
GPU device detection, comprising two aspects:
1. detecting the number and the type of the devices;
2. detecting busy and idle states;
(1) GPU device quantity detection
The number, names and UUIDs of the GPU cards configured in the system can be detected using the nvidia-smi command, and the corresponding values are inserted into the sys_settings table.
Insert into sys_settings(`key`,`value`,`memo`)
Values
(‘GPU 0’,‘GPU 0:Tesla P40’,‘GPU-dd57c03c-b961-3e93-638e-1ed51f29552e’)
Insert into sys_settings(`key`,`value`,`memo`)
Values
(‘GPU 1’,‘GPU 1:Tesla P40’,‘GPU-d5f59f07-4c1d-923e-491a-86d64af208fe’)
The detection method comprises the following steps:
the probe module probe_gpu_device needs to be executed outside the container.
Detection strategy:
since the devices are fixed after installation, the detection strategy is to execute once; an enumeration sketch follows.
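A minimal sketch of the one-time device enumeration, assuming nvidia-smi is available on the PATH outside the container; writing to sys_settings is left to the caller, and the row layout mirrors the Insert statements above.
import subprocess

def detect_gpus():
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,name,uuid", "--format=csv,noheader"],
        text=True)
    rows = []
    for line in out.strip().splitlines():
        index, name, uuid = [f.strip() for f in line.split(",")]
        # Each row maps to one Insert into sys_settings(`key`,`value`,`memo`).
        rows.append(("GPU %s" % index, "GPU %s:%s" % (index, name), uuid))
    return rows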
(2) Detection of busy and idle of GPU equipment
The busy/idle snapshots of all GPU cards configured in the current system can be detected using the nvidia-smi command, and the processed snapshot results are inserted into the snapshot flow table worksum_gpu.
Insert into worksum_gpu(`gpu_id`,`gpu`)values(‘0’,‘45’)
Insert into worksum_gpu(`gpu_id`,`gpu`)values(‘1’,‘23’)
The detection method comprises the following steps:
the probe module probe_gpu_idle needs to be executed outside the container on a timer; the timing interval can be configured via a configuration file; detecting once every 1 second, or every 3 seconds, is recommended.
Detection strategy:
1) Performing at fixed time;
2) To facilitate statistics and reduce the amount of data, nothing is written to the database when a GPU is idle, i.e. its value is 0; a polling sketch follows.
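A minimal sketch of the timed busy/idle probe, assuming a 1-second polling interval; idle snapshots (utilization 0) are skipped, matching the detection strategy above, and the database write is delegated to a caller-supplied function that issues the Insert into worksum_gpu statement.
import subprocess
import time

def probe_gpu_idle(write_row, interval=1.0):
    while True:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=index,utilization.gpu",
             "--format=csv,noheader,nounits"], text=True)
        for line in out.strip().splitlines():
            gpu_id, util = [f.strip() for f in line.split(",")]
            if util != "0":                  # do not record idle GPUs
                write_row(gpu_id, util)      # e.g. Insert into worksum_gpu(`gpu_id`,`gpu`) values(...)
        time.sleep(interval)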
Sixth, the newly added gpu_langpair table is used for storing the configuration situation of the gpu card language pair:
1.1 Table structure is as follows:
id | gpu_name | Langpair
1  | gpu 0    | EN-ZH
2  | gpu 1    | ZH-EN
1.2 calling script to execute gpu language pair switching:
script position:
1.3 A new interface /uwsgi/admin/gpu_used is added; the queried GPU usage is displayed on the page.
Seventh, a translation start/stop button is added to the system and placed on the system parameter settings page.
1.1 When the stop button is pressed, the system does not accept translation requests and only allows the background pages to be managed.
1.2 For GPU card setup, system translation must be stopped before configuration (because the GPU configuration requires a restart, this prevents abnormal data being returned for in-flight translation requests).
1.3 A restart-system function is added to the page to restart the background services.
Eighth, a new function is added for resetting the language model of a GPU card.
1.1 A front-end page "policy setting" is added, which displays the language models available to the GPU cards in the current system and the language models currently configured.
1.2 The user selects the GPU card's language model and confirms with a button to send the reconfigured language model to the back end.
1.3 The Python back end adds a new interface for configuring the GPU language model, which requests the Flask server on internal port 10260 of nmt_worker_d0/nmt_worker_d1 to perform the language-model configuration.
1.4 A new Flask server is added inside the GPU Docker container to receive the configuration request from the main program and execute the script that configures the language model; a sketch follows.
Ninth, HAProxy configuration
1.1 After the GPU language pair is configured, the HAProxy routing is reconfigured; once the GPU language model has been configured successfully, the script that configures HAProxy is executed (the handling of the view's return value is still problematic; the execution is performed in Celery).
1. Setting page in system (IP address page)
Add subnet mask settings with an OK button to set the IP address and subnet mask.
Add the translation start/stop button with a confirmation popup; the page then prompts the user that the translation function has stopped.
Add a system restart button; after it is pressed a popup confirms the restart, and the page is refreshed later to check the system.
2. All statistical charts
DashBoard left pie chart language statistics interface debugging
4. Preparation of test data (python)
5. Dongle test
* Add per-user language-pair designation and language-pair disabling settings
6. Add new language-pair selections in the user dialog, so the user can select multiple language pairs
7. The modification page can also modify a user's language pairs
8. The disable setting can also disable specific language pairs for a user
9. The Python back end stores the disabling information for each user and language pair
10. The caller center page is to display the language pairs available to the user (elegant display), user disable flags (need to be designed to be put in a reasonable place)
It should be noted that the embodiments of the present invention can be realized in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or special purpose design hardware. Those of ordinary skill in the art will appreciate that the apparatus and methods described above may be implemented using computer executable instructions and/or embodied in processor control code, such as provided on a carrier medium such as a magnetic disk, CD or DVD-ROM, a programmable memory such as read only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The device of the present invention and its modules may be implemented by hardware circuitry, such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, etc., or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., as well as software executed by various types of processors, or by a combination of the above hardware circuitry and software, such as firmware.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (6)

1. A MerCube machine translation management control system, the MerCube machine translation management control system comprising: an application layer and a platform layer;
the application layer comprises: the system comprises an integrated management module, an upgrade management module and a monitoring management module;
the platform layer comprises: a device management module, a container management module, a machine translation interface management module, a machine translation capability management module and a copyright management module;
the integrated management module comprises:
Document processing integration unit: used for responding to a call request from the document processing system via HTTP Post/Get and returning the translation result;
LangBox integration unit: used for responding to a call request from the LangBox via HTTP Post/Get and returning the translation result;
User third-party system integration unit: used for responding to a call request from a user's third-party system via HTTP Post/Get or with a docx/excel file as the medium, and returning the translation result;
the upgrade management module comprises:
an upload management unit: used for uploading updated files, data models, configuration files, programs, and scripts; each upload must contain a script file; after the upload finishes, MerCube automatically executes the script to complete the upgrade of the back-end system;
a download management unit: used for downloading logs and any file on the MerCube system;
a script management unit: used for uniformly managing all uploaded and downloaded scripts and regularly cleaning up outdated scripts;
the monitoring management module comprises:
a system management unit: used for the administrator to manage the MerCube system, including modifying, saving, and setting system parameters such as names and passwords;
a call fee statistics unit: used for performing cost statistics on the translation requests made by a calling user over a period of time and summarizing them in table form; statistics on language pairs, translated word counts, and requested word counts are produced by hour, day, and month (a minimal aggregation sketch follows this claim);
a caller management unit: used for adding, deleting, modifying, and querying calling users;
a call history statistics unit: used for tracking the history of translation requests made by a calling user over a period of time;
a caller authority management unit: used for controlling whether a calling user may make translation requests, including the controlled time period and the controlled language pair direction;
a device usage monitoring unit: used for detecting GPU card usage in real time, including: GPU busy/idle occupancy over the past 1/4/8/24 hours; the number of requests per language pair; and the average request completion time;
a caller pricing management unit: used for charging control when a calling user makes translation requests, and for setting the unit price.
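The call fee statistics unit in claim 1 summarizes language pairs, translated word counts, and requested word counts by hour, day, and month. The sketch below shows one plausible aggregation over call records; the record fields, bucket keys, and function name are assumptions, since the claim fixes only the statistics to be produced.

# Minimal sketch of per-period call statistics (by hour/day/month).
# Record fields and the bucketing scheme are illustrative assumptions.
from collections import defaultdict
from datetime import datetime


def aggregate_calls(records, granularity="day"):
    """records: iterable of dicts with keys 'timestamp' (datetime),
    'lang_pair', 'requested_words', 'translated_words'."""
    fmt = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d", "month": "%Y-%m"}[granularity]
    buckets = defaultdict(lambda: {"requests": 0, "requested_words": 0, "translated_words": 0})
    for r in records:
        key = (r["timestamp"].strftime(fmt), r["lang_pair"])
        buckets[key]["requests"] += 1
        buckets[key]["requested_words"] += r["requested_words"]
        buckets[key]["translated_words"] += r["translated_words"]
    return dict(buckets)  # one row per (period, language pair), ready to render as a table


stats = aggregate_calls(
    [{"timestamp": datetime(2019, 2, 22, 9, 5), "lang_pair": "zh-en",
      "requested_words": 120, "translated_words": 118}],
    granularity="hour",
)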
2. The MerCube machine translation management control system of claim 1, wherein the device management module comprises:
a GPU device allocation management unit: used for statically allocating which language model runs on which GPU card, or for temporarily withdrawing a GPU card from allocation while the system is running normally;
a GPU development environment management unit: used for upgrading and testing the driver modules and development environment of the GPU cards;
the container management module includes:
a machine translation scheduling container unit: used for one-key start/stop and management of the scheduling container, in which Nginx, HAProxy, Redis, MySQL, and uWSGI are installed;
a machine translation engine container unit: used for one-key start/stop and management of the machine translation engine container; the container holds the machine translation core program and modules, together with the deep-learning-trained language models and configuration files;
the machine translation interface management module comprises:
a call authority management unit: used for setting calling-user permissions, allowing or prohibiting a user from using the machine translation system during certain time periods; the language pairs that may be used can also be defined; combined with the flow-control measures, a user's usage can be regulated dynamically;
a call queue priority management unit: used for managing the priority levels of calling users; high-priority users are allocated resources and receive responses first; low-priority users obtain translation resources when the system is idle;
an optimization policy management unit: used for unified policy management of the optimization measures and for combining them;
a flow control management unit: when the translation load is heavy and a user's request volume is large, flow-control measures are started to protect the interests of all users and to limit access for heavy requesters;
the optimization measures comprise:
(1) scheduling machine translation resources and managing the GPU graphics card devices;
(2) caching translation results by exploiting the fast access of the Redis in-memory database;
(3) merging multiple small requests from a user and submitting them to the background for translation at once, which returns results faster than submitting sentence by sentence;
the flow control includes: the flow-control modes are divided into static control, dynamic control, and hybrid control;
static control: a fixed flow-control threshold parameter is preset for a given user; when the user reaches the threshold, the user's usage is limited for a period of time;
dynamic control: the time between each call to the translation interface and the background's return is measured to judge how busy the background translation tasks are; the call response time must stay below a certain value, beyond which traffic is temporarily limited;
hybrid control: dynamic control and static control are supported simultaneously, dynamic first and then static; after the dynamic check passes, the static check is applied;
use of the flow control includes:
the flow-control threshold parameters can be combined via the optimization policy management unit; the flow-control measures can also be used together with user authority management (a minimal flow-control sketch follows this claim);
the machine translation capability management module comprises:
a language pair management unit: the machine translation capability carried by the MerCube system can be tailored and customized, deciding which language pairs are exposed for users and which are temporarily disabled or hidden;
a translation capability management unit: used for manually adjusting, on the basis of GPU device management, the language models loaded on the GPU cards installed in the MerCube system, so as to meet a business need for higher translation volume in a certain language during a certain time period;
the copyright management module includes: adopting various measures to protect the intellectual property of the system;
(1) the system collects key features of the MerCube machine to ensure that the engine can only run on the fixed hardware and is bound to fixed GPU cards;
(2) the system embeds the code of the core translation algorithm and the key feature acquisition algorithm in the dongle and loads it dynamically at run time, preventing the system from being copied or cracked.
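Claim 2 describes static, dynamic, and hybrid flow control for calling users. The following minimal sketch assumes a sliding-window request counter for the static threshold and a last-observed response time for the dynamic check; the threshold values, window size, and class name are illustrative assumptions rather than parameters fixed by the claim.

# Minimal sketch of static / dynamic / hybrid flow control.
# Thresholds, the sliding window, and all names are assumptions.
import time
from collections import defaultdict, deque


class FlowController:
    def __init__(self, max_requests_per_window=100, window_seconds=60,
                 max_response_seconds=2.0):
        self.max_requests = max_requests_per_window   # static threshold per user
        self.window = window_seconds
        self.max_response = max_response_seconds      # dynamic threshold on backend latency
        self._calls = defaultdict(deque)              # user_id -> timestamps of recent calls
        self._last_response_time = 0.0

    def record_response(self, seconds):
        """Record how long the last backend translation took (dynamic-control input)."""
        self._last_response_time = seconds

    def allow_static(self, user_id):
        """Static control: a fixed per-user threshold over a sliding window."""
        now = time.time()
        calls = self._calls[user_id]
        while calls and now - calls[0] > self.window:
            calls.popleft()
        if len(calls) >= self.max_requests:
            return False                              # limit this user for a while
        calls.append(now)
        return True

    def allow_dynamic(self):
        """Dynamic control: throttle while the backend is responding slowly."""
        return self._last_response_time < self.max_response

    def allow_hybrid(self, user_id):
        """Hybrid control: the dynamic check first, then the static check."""
        return self.allow_dynamic() and self.allow_static(user_id)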
3. A MerCube machine translation management control method for operating the MerCube machine translation management control system of claim 1, the MerCube machine translation management control method comprising the steps of:
step one, merging multiple data packets according to the request-packet merging strategy;
step two, designing routing rules and scheduling the data packets;
step three, processing the data containing the text content: after sentence splitting, the sentences are placed in srcl_list, and a tgtl_list is created to hold the results of the corresponding sentences looked up in redis;
step four, traversing srcl_list, taking each value, computing its MD5 hash, looking it up in redis, and storing the retrieved data in tgtl_list;
step five, combining the sentences not found in the cache into one batch and passing them on to NMT for translation;
step six, traversing the translated sentences, computing their MD5 hashes and storing them in redis, then filling the translations into tgtl_list to form a complete translation list;
and step seven, splicing the translation list into a string and returning it (a minimal cache-flow sketch follows this claim).
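Steps three to seven of claim 3 describe a sentence-level translation cache keyed by MD5 hashes in redis. The sketch below illustrates that flow, assuming a line-based sentence split, an "mt:<lang_pair>:<md5>" key scheme, and a translate_batch callable standing in for the NMT engine; none of these details are fixed by the claim.

# Minimal sketch of the redis-backed sentence cache flow of claim 3.
# Key scheme, sentence splitting, and translate_batch are assumptions.
import hashlib

import redis


def cache_key(lang_pair, sentence):
    digest = hashlib.md5(sentence.encode("utf-8")).hexdigest()
    return "mt:%s:%s" % (lang_pair, digest)


def translate_with_cache(text, lang_pair, translate_batch, r=None):
    """translate_batch(list_of_sentences) -> list_of_translations (the NMT call)."""
    r = r or redis.Redis(decode_responses=True)
    src_list = [s for s in text.split("\n") if s.strip()]          # naive sentence split
    tgt_list = [r.get(cache_key(lang_pair, s)) for s in src_list]  # step four: cache lookups

    # Step five: collect the cache misses and send them to NMT in one batch.
    miss_idx = [i for i, t in enumerate(tgt_list) if t is None]
    if miss_idx:
        translations = translate_batch([src_list[i] for i in miss_idx])
        # Step six: store the new translations in redis and fill the target list.
        for i, translated in zip(miss_idx, translations):
            r.set(cache_key(lang_pair, src_list[i]), translated)
            tgt_list[i] = translated

    # Step seven: splice the translation list back into one string.
    return "\n".join(tgt_list)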
4. The MerCube machine translation management control method according to claim 3, wherein the routing rules in step two include:
request priority levels are defined dynamically, with no fixed number of levels; the specific number of levels is determined and requested by the party issuing the translation request;
the request levels are represented by a singly linked list, and the requests by a queue; level N is the highest and level 0 the lowest;
(1) Routing-in rule: when a request call arrives, the language pair (LangPair) linked list is searched first; once the language pair is found, the Priority linked list at the corresponding level is searched, and once that is found, the new request is queued behind all calls already waiting in that list; if the language pair is not found, one is created and inserted at the tail of the language pair linked list; if the priority level is not found, one is created and inserted, keeping descending order, at the proper position in the priority queue; after a call enters the queue, the Content-Length values of all calls under that priority level are counted, accumulated, and stored in the priority-level node;
(2) Routing-out rules:
1) language pair: no priority ordering; all priority linked lists under each language pair are checked in sequence;
2) priority level: ordered by priority; the level-N linked list is output first, then level N-1, and so on down to level 0;
within a linked list at the same level, call_0 is output first, then call_1, and so on until call_n;
3) when a request call arrives: whenever a request call arrives, the priority-level nodes of the call's language pair are checked; starting from the highest level N, the stored Content-Length value is read and compared against the set value: if it has been reached, all call request packets are merged into one large packet and sent; if the set size has not been reached, level N-1 is checked next and the Content-Length values of the queues below it are accumulated, continuing down to level 0; after the merged packet is sent, the merged calls are removed from the queue;
4) when the timer expires: all language pairs are checked in sequence from the head of the list, with the same processing as in [when a request call arrives] above;
5) at the end of the check, if the accumulated Content-Length value is still smaller than the set value, the remaining queued packets are merged and sent; meanwhile, the queues under all language pairs and priority levels are emptied;
in step two, the scheduling management includes:
(1) request packets are received and stored in different priority queues according to the priority parameter; the queues are scanned at 5 ms intervals; packets with length >= 2000 are sent directly to the background for processing as non-merged requests;
(2) the dispatch mechanism is triggered: request packets are merged from the highest-level queue, starting at the head of the queue, until the set packet length is reached;
(3) merging rules: when the scan interval elapses, only requests in the same language pair direction are merged, and they are submitted to the back end for processing whether or not the packet length has reached the requirement (a minimal scheduling sketch follows this claim).
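The scheduling in claim 4 merges queued requests of the same language pair direction, working from the highest priority level downward, until the accumulated Content-Length reaches a set value, and sends oversized requests straight to the background. The sketch below illustrates that merge-and-dispatch logic; the 2000-byte threshold comes from step (1) above, while the queue layout, class name, and dispatch callable are assumptions.

# Minimal sketch of priority-queue request merging and dispatch.
# The queue layout and dispatch callable are illustrative assumptions.
from collections import defaultdict, deque

MERGE_THRESHOLD = 2000  # combined Content-Length that triggers dispatch


class TranslationScheduler:
    def __init__(self, dispatch):
        self.dispatch = dispatch                  # callable taking a list of requests
        # (lang_pair, priority) -> FIFO of requests; larger priority value = higher level
        self.queues = defaultdict(deque)

    def enqueue(self, request):
        """request: dict with 'lang_pair', 'priority', 'content_length', 'payload'."""
        if request["content_length"] >= MERGE_THRESHOLD:
            self.dispatch([request])              # large requests go straight to the background
            return
        self.queues[(request["lang_pair"], request["priority"])].append(request)

    def scan(self, lang_pair):
        """Run on a timer (e.g. every 5 ms): merge from the highest priority downward."""
        merged, total = [], 0
        priorities = sorted({p for lp, p in self.queues if lp == lang_pair}, reverse=True)
        for prio in priorities:
            queue = self.queues[(lang_pair, prio)]
            while queue:
                req = queue.popleft()
                merged.append(req)
                total += req["content_length"]
                if total >= MERGE_THRESHOLD:
                    self.dispatch(merged)         # send one merged packet
                    merged, total = [], 0
        if merged:                                # flush whatever is left at the end of the scan
            self.dispatch(merged)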
5. An information data processing terminal for implementing the MerCube machine translation management control method according to any one of claims 3 to 4.
6. A computer readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the MerCube machine translation management control method of any of claims 3 to 4.
CN201910131256.9A 2019-02-22 2019-02-22 Mercube machine translation management control system and method and computer program Active CN109992796B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910131256.9A CN109992796B (en) 2019-02-22 2019-02-22 Mercube machine translation management control system and method and computer program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910131256.9A CN109992796B (en) 2019-02-22 2019-02-22 Mercube machine translation management control system and method and computer program

Publications (2)

Publication Number Publication Date
CN109992796A CN109992796A (en) 2019-07-09
CN109992796B (en) 2023-07-04

Family

ID=67130298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910131256.9A Active CN109992796B (en) 2019-02-22 2019-02-22 Mercube machine translation management control system and method and computer program

Country Status (1)

Country Link
CN (1) CN109992796B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502762B (en) * 2019-08-27 2023-07-28 北京金山数字娱乐科技有限公司 Translation platform and management method thereof
CN112564888B (en) * 2020-12-03 2023-01-24 云知声智能科技股份有限公司 Method and equipment for deploying private cloud
CN112818710A (en) * 2021-02-05 2021-05-18 中译语通科技股份有限公司 Method and device for processing asynchronous network machine translation request

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003108553A (en) * 2001-09-27 2003-04-11 Toshiba Corp Machine translation device, machine translation method and machine translation program
CN1661593A (en) * 2004-02-24 2005-08-31 北京中专翻译有限公司 Method for translating computer language and translation system
CN1670723A (en) * 2004-03-16 2005-09-21 微软公司 Systems and methods for improved spell checking
CN105975625A (en) * 2016-05-26 2016-09-28 同方知网数字出版技术股份有限公司 Chinglish inquiring correcting method and system oriented to English search engine
WO2018197921A1 (en) * 2017-04-25 2018-11-01 Systran A translation system and a method thereof
CN109344410A (en) * 2018-09-19 2019-02-15 中译语通科技股份有限公司 A kind of machine translation control system and method, information data processing terminal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A statistical translation method based on phrase fuzzy matching and sentence expansion; Liu Peng et al.; Journal of Chinese Information Processing; 2009-09-15 (No. 05); full text *

Also Published As

Publication number Publication date
CN109992796A (en) 2019-07-09

Similar Documents

Publication Publication Date Title
US10725752B1 (en) Dependency handling in an on-demand network code execution system
US11615376B2 (en) Techniques for managing functionality changes of an on-demand database system
US10372435B2 (en) System, method and program product for updating virtual machine images
CN109992796B (en) Mercube machine translation management control system and method and computer program
CA2919839C (en) Virtual computing instance migration
US10762435B2 (en) Systems and techniques for utilizing resource aware queues and/or service sharing in a multi-server environment
US20170269967A1 (en) Quality of service classes
CN112513811A (en) Operating system customization in on-demand network code execution system
US11252220B2 (en) Distributed code execution involving a serverless computing infrastructure
US20190205173A1 (en) Systems and methods for resource management for multi-tenant applications in a hadoop cluster
CN112513813A (en) Performing auxiliary functions in an on-demand network code execution system
KR101777392B1 (en) Central server and method for processing of voice of user
US11507417B2 (en) Job scheduling based on job execution history
US10331482B2 (en) Automatic reconfiguration of high performance computing job schedulers based on user behavior, user feedback, and job performance monitoring
CN107944000B (en) Flight freight rate updating method and device, electronic equipment and storage medium
CN112333096A (en) Micro-service traffic scheduling method and related components
US10437645B2 (en) Scheduling of micro-service instances
CN109343862A (en) The dispatching method and device of the resource data of application
US20190215772A1 (en) Cognitive assistant for mobile devices in low power mode
US10474475B2 (en) Non-intrusive restart of a task manager
US11977909B2 (en) Hardware placement and maintenance scheduling in high availability systems
US11386183B2 (en) Systems and methods for predictive caching
CN113986097B (en) Task scheduling method and device and electronic equipment
US20230125765A1 (en) Container pool management
US10976929B2 (en) Cognitively managed storage volumes for container environments

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant