Data replication ensures optimum quality of service all the way to the edge of the enterprise where connectivity is intermittent or limited, says Steve Driver.
Geographic remoteness and poor connectivity mean many oil and gas organisations encounter issues when attempting to keep remote sites or workers up to date with essential data.
Although the increasing pervasiveness of broadband fixed and wireless networks has made physical proximity to the central office much less important than it was, the fact remains that connectivity varies dramatically outside of built-up areas, while the internet was never intended as a backbone for business applications.
The problem of inconsistent connectivity and limited bandwidth is compounded by the trend towards ever-more distributed enterprises. The growing number of mobile workers and remote sites spanning home, branch, regional and international offices has meant the centralised architectures employed by most business applications today are out of sync with the distributed enterprise.
Latency is a major source of frustration for users of remote applications.
Network outages and connectivity issues aside, the fact that an application must communicate via a network during its operation introduces noticeable delays in processing and usability – even over high-speed networks. The effect becomes more pronounced as the distance between the user and the data centre increases.
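As a rough illustration of why latency compounds, consider a screen that issues a fixed number of database requests one after another, each waiting for a full network round trip. The function name and all figures in this sketch are illustrative assumptions, not measurements.

```python
def network_wait_seconds(round_trips, rtt_ms):
    """Time spent waiting on the network when requests run one after another."""
    return round_trips * rtt_ms / 1000.0


# Assumed figures for illustration only: 40 sequential requests per screen,
# a 5 ms round trip on an office LAN and a 600 ms round trip over satellite.
lan_wait = network_wait_seconds(40, 5)
satellite_wait = network_wait_seconds(40, 600)
```

Under these assumptions the same screen waits a fraction of a second on the LAN and 24 seconds over the satellite link, however much bandwidth is available.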
Where organisations are attempting to run applications locally at remote sites against large database systems such as Microsoft SQL Server, Oracle or IBM DB2 hosted at head office, it can be almost impossible to achieve the level of quality of service (QoS) required to keep the local instance (the slave) in sync with the central database (the master).
This is why a fundamental change in the way business applications are architected is required by users such as oil and gas operators – for whom access to current data is business or mission critical.
Delivering a distributed model
There are several ways in which to provide improved access to applications and data. The first step is to recognise the internet’s inability to provide reliable access, and change its role in the application architecture from ‘mission-critical backbone’ to ‘occasionally needed service’. Based on this approach, there are four core technology options:
* N-Tier Client Server – closest to the traditional, in-house, centralised application environment, this scenario involves a central database server and deployment of robust client applications at each remote site or user location. Network connectivity is essential, with performance tied to available bandwidth and reliability hinging on network availability;
* Thin Client ‘Application Access Portals’ – a remote control operation where network-dependent terminals access one or more central servers. Each user has their own virtual machine running on these central servers, on which the applications are loaded and executed. Regardless of bandwidth requirements, network connectivity is necessary to use the application. Latency may be an issue as keystroke and GUI data must be sent between the thin client and the server;
* Web Client – encompasses several different client implementations, the most common being web browser-based applications, ‘thick client’ applications using web services technologies, and server-deployed but locally executed applications. As with the previous two approaches, network and server reliability is the determining factor for application availability;
* Distributed Applications & Data – involves deploying independent, replicated database instances together with a robust client application, either in a remote office or on a user’s laptop. The database needs to be synchronised at regular intervals, with frequency dependent on application and business requirement. This solution can tolerate frequent network outages and bandwidth restrictions and still allow remote users to continue working.
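The fourth option can be sketched in a few lines. The example below is a minimal illustration, assuming a hypothetical `LocalReplica` class, a hypothetical `work_orders` table and a caller-supplied `send` function that raises `ConnectionError` when the link is down. It uses SQLite only because it ships with Python; it is not a description of any particular replication product.

```python
import json
import sqlite3


class LocalReplica:
    """Hypothetical site-local database with an outbox of unsent changes."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE work_orders (id TEXT PRIMARY KEY, status TEXT)")
        self.db.execute(
            "CREATE TABLE outbox "
            "(seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)")

    def save_work_order(self, order_id, status):
        # Write locally and queue the change in a single transaction, so the
        # user can keep working with no network access at all.
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO work_orders (id, status) VALUES (?, ?)",
                (order_id, status))
            self.db.execute(
                "INSERT INTO outbox (payload) VALUES (?)",
                (json.dumps({"id": order_id, "status": status}),))

    def pending(self):
        """Number of changes still waiting to be sent."""
        return self.db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]

    def synchronise(self, send):
        """Push queued changes in order; stop quietly if the link drops."""
        rows = self.db.execute(
            "SELECT seq, payload FROM outbox ORDER BY seq").fetchall()
        sent = 0
        for seq, payload in rows:
            try:
                send(json.loads(payload))
            except ConnectionError:
                break  # keep the rest for the next scheduled attempt
            with self.db:
                self.db.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
            sent += 1
        return sent
```

A scheduler would call `synchronise` at whatever interval the business requires. A failed attempt costs nothing, because the queued changes simply wait for the next session.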
Breaking from the past
Decentralisation, or the distributed data and applications approach, has become a major trend for three reasons: remote offices do not shut down when network connectivity is slow or lost, the data centre is no longer a single point of failure, and off-the-shelf computers can be employed. Rather than providing all access to the data and business logic from a central location, processing is distributed to smaller servers across the enterprise, reducing the need for expensive, multi-processor servers at the data centre.
However, centralised systems may still be needed for reporting purposes or to serve large sites. In addition, there is a potential for several disadvantages depending upon the technologies chosen to construct a decentralised or distributed solution:
* Extensive application redesign – some middleware solutions require special APIs for an application to communicate with the database;
* Data conflicts – some data replication solutions also require extensive administrative intervention to resolve data conflicts (because the typical unit of replication is an entire data record);
* High bandwidth utilisation – most replication solutions send the full content of all transactions to each site, including all intermediate changes to the same data. The complete dataset must also be maintained at all sites;
* High maintenance burden – log-based replication solutions need to be synchronised periodically. Additionally, when sites haven’t replicated for extended periods of time, the log file may fill the server’s disk and result in downtime and an unscheduled synchronisation session.
These issues can be addressed using a variety of approaches:
* Multiple programming language support – a solution that supports any programming language can eliminate the need for application changes to access or update the database;
* Partitioning records by update authority – enables administrators to allow simultaneous updates when they comply with the business rules defined by the user and thus avoid false conflicts;
* ‘Net change’ model – bandwidth utilisation is reduced substantially if only data that has been updated since the last replication is sent through the network. It is also possible to partition data so that only the information pertinent to a specific site is sent through the network;
* ‘Live’ database access during replication sessions – means synchronisation is not required and log files do not need to be managed.
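Two of these approaches lend themselves to a short sketch. The first function below illustrates a ‘net change’ calculation that discards intermediate changes and can filter by site. The second illustrates one possible reading of update authority, in which each field may be updated by one named site. The function names, the dictionary keys and the per-field ownership rule are all assumptions made for illustration.

```python
def net_changes(change_log, last_replicated_seq, site=None):
    """Collapse a change log into the latest value of each changed field.

    Each entry is a dict with the hypothetical keys 'seq', 'record_id',
    'field', 'value' and 'site' (the site the record is pertinent to).
    """
    latest = {}
    for change in sorted(change_log, key=lambda c: c["seq"]):
        if change["seq"] <= last_replicated_seq:
            continue  # already sent in an earlier replication session
        if site is not None and change["site"] != site:
            continue  # partitioning: not pertinent to this site
        # Later changes overwrite earlier ones, so only the net result is kept.
        latest[(change["record_id"], change["field"])] = change["value"]
    return latest


def merge_by_authority(record, incoming, sender, authority):
    """Apply only the fields that the sending site is allowed to update.

    `authority` maps each field name to the one site that owns it, which is
    an assumed business rule for this sketch.
    """
    merged = dict(record)
    rejected = {}
    for field, value in incoming.items():
        if authority.get(field) == sender:
            merged[field] = value
        else:
            rejected[field] = value
    return merged, rejected
```

Because the unit of comparison here is a field rather than an entire record, two sites updating different fields of the same record at the same time do not produce a conflict.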
Database replication technologies allow a rich-client interface to operate uninterrupted via a local database, even during periods of complete network unavailability. They can then allow updates to stream back and forth over the network during periods of acceptable network QoS. This distributed or decentralised model gives all workers – whether they are in a remote office or working from a rig/at sea and using a satellite link – equal access to perfectly performing and fully functional enterprise applications.
Enabling disconnected use of fully functional applications and data is an essential requirement for any distributed approach. This does not mean providing users with read-only versions of their data. It means fully functional, read-write access to data as if they were still connected to the network without degrading application performance. This is achieved using asynchronous update-everywhere replication, as opposed to less efficient message-based or synchronous replication.
Asynchronous update-everywhere replication allows organisations to manage their disconnected remote sites and mobile workforce centrally from the office, regardless of latency or bandwidth. Moreover, it does not rely on email or FTP, and it does not require all sites to be available at the same time for replication to take place.
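The idea that sites need not be online at the same time can be sketched with a central relay that holds changes for each site until it next connects. The `Hub` and `Site` classes below are hypothetical, and the sketch deliberately omits conflict resolution, compression and security: an incoming change simply overwrites the local value.

```python
from collections import deque


class Hub:
    """Hypothetical central relay that holds changes until each site connects."""

    def __init__(self, site_names):
        self.queues = {name: deque() for name in site_names}

    def exchange(self, site_name, outgoing):
        # Queue this site's changes for every other site...
        for name, queue in self.queues.items():
            if name != site_name:
                queue.extend(outgoing)
        # ...and hand back whatever accumulated while it was offline.
        incoming = list(self.queues[site_name])
        self.queues[site_name].clear()
        return incoming


class Site:
    """Hypothetical site with a full read-write local copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.unsent = []  # changes made since the last replication session

    def update(self, key, value):
        # The write completes locally at once; replication happens later.
        self.data[key] = value
        self.unsent.append((key, value))

    def replicate(self, hub):
        incoming = hub.exchange(self.name, self.unsent)
        self.unsent = []
        for key, value in incoming:
            self.data[key] = value
```

For example, a vessel can call `replicate` during a brief satellite window, and an onshore office that connects hours later still receives the vessel's changes from the relay.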
Such solutions are particularly important in sectors such as marine transportation, offshore oil and gas, and manufacturing and production, but are just as applicable for any organisation needing to manage data across multiple sites, geographies, platforms, or database management systems.
Crucially, users do not have to be connected to a network to access their data. Instead, they can obtain up-to-date information at any time with the same levels of QoS, performance and management costs as those in the central office – using either simple data replication or complete synchronisation.
PTTEP Australasia operates a floating production, storage and offloading (FPSO) vessel supporting its Montara, Swift and Skua oilfields in the southern Timor Sea. To support this sophisticated vessel, the operator needed to integrate offshore maintenance and onshore procurement processes. Moreover, to meet regulatory requirements, it needed to implement a computerised maintenance management system for the FPSO.
In previous projects, it had been using separate solutions for maintenance and procurement. However, the interfaces were different, so if engineers were working on a maintenance task and needed to order some materials, they had to switch to the other application. Workflow management was also difficult, so a lot of processes involved printing out work orders and sending paperwork around for approvals. PTTEP implemented IBM Maximo, an asset management system, to integrate offshore maintenance and onshore procurement as part of an electronic maintenance management system.
The biggest challenge, however, was to find a way to keep PTTEP’s maintenance and procurement processes in sync between its onshore offices in Perth and Darwin and its offshore environment on the FPSO. When the FPSO is at sea, it is only able to communicate with onshore sites via a satellite link – which means the bandwidth is relatively low, and there are times when it cannot connect at all.
For this reason, PTTEP needed to have separate instances of Maximo in Perth, Darwin, and on the FPSO.
To keep the instances of Maximo in sync, PTTEP implemented DataXtendRE (DXRE), an asynchronous data replication solution that compresses data to minimise satellite bandwidth requirements. It automatically transmits and synchronises data between three linked instances of Maximo: a master node that handles background processes and reporting; a slave node for the onshore users to log into; and another slave node on the FPSO itself.
New or changed data is replicated from the slaves to the master and vice versa, so that all three systems are kept in sync over a low bandwidth satellite connection even though they operate independently. Should the connection be lost temporarily, the FPSO’s Maximo instance can still support maintenance tasks and procurement processes. This ensures PTTEP complies with its regulatory obligations.
Steve Driver is with software developer DXSTRO, Eccles, UK.