Developer’s Role in Achieving High Availability

There are many ways to improve application availability. You can add more infrastructure in order to provide more resources to power your applications, and hopefully get an availability boost from it. You can improve the monitoring and resolution processes of your IT Ops team in order to reduce downtime.

And you can also write better, more resilient code. Improving code is often not easy, but it is the best thing developers can do to improve availability.

This article examines several strategies developers can follow in order to build more highly available applications.

Configuration

Configuration should be externalized as much as possible. The easiest way to do this is with environment variables, which are read at runtime, so operational teams can deploy multiple instances in multiple environments without adjusting property files or repackaging the app in any other way. Regardless of the language in use, environment variables are always available.

Environment variables are best for things that are required to start and get an application running, like server addresses, port numbers, and security credentials. These are items which operational teams will want to adjust potentially on every server, and will be included in the deployment pipeline.

Sample Ways to Access Environment Variables

PHP: $_ENV['VARIABLE']
Java: System.getenv("VARIABLE")
Python: os.environ['VARIABLE']
JavaScript / Node.js: process.env.VARIABLE
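
As a minimal sketch of this approach in Python, the function below collects startup settings from environment variables and fails fast when a required one is missing. The variable names (DB_HOST, DB_PORT, DB_PASSWORD) and the default port are assumptions made for illustration, not values from any particular platform.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    db_host: str
    db_port: int
    db_password: str


def load_settings(env=os.environ) -> Settings:
    """Build settings from environment variables, failing fast on missing ones."""
    # DB_HOST, DB_PORT and DB_PASSWORD are hypothetical names for this sketch.
    missing = [name for name in ("DB_HOST", "DB_PASSWORD") if name not in env]
    if missing:
        raise RuntimeError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        db_host=env["DB_HOST"],
        db_port=int(env.get("DB_PORT", "5432")),  # optional, with an assumed default
        db_password=env["DB_PASSWORD"],
    )
```

Passing the mapping in as a parameter keeps the function easy to test, and failing at startup surfaces a misconfigured deployment before it takes traffic.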

You can also store configuration values in the database, but this is a less common approach. Normally, you’d use databases to store data that the application needs after it starts up, rather than values that must be available at startup.

Data Tier

Most applications are data-driven, and that data access and management layer can be a large source of high-availability problems. The largest sources of problems I’ve seen are related to locking in the data store.

By default, most data stores lock only the row or document that is explicitly being updated or deleted. A developer, however, can write an application so that it locks more than that for the sake of data protection, and every other request that needs the locked data must then wait. For example, an entire table or collection could be locked:

Table Locking

Good: updating the schema

Exception Handling

Exception handling for connections is extremely important in high-availability scenarios, and it is overlooked more often than any other problem I’ve experienced in my career. There are two main types of exceptions that need to be caught and handled by retrying the connection:

  1. The stale connection exception, raised when the server has closed the connection but the client still thinks it’s open. A common scenario is a database server that has failed over to its secondary node: the client tries to reuse a connection from its pool, and the attempt fails.
  2. The failure to connect, either by timeout or connection refused. If a server crashes, failover may take a few seconds, depending on the database platform in use. That can be long enough to disrupt availability.
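
Both cases above come down to catching the failure and trying again. The following is a minimal sketch of that idea in Python, not a specific driver's API: `connect` is a hypothetical callable that opens a connection, and the attempt count and delays are assumed values. Python's built-in ConnectionError covers refused and reset connections; a real database driver raises its own exception types, which would be listed alongside.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call connect() until it succeeds, backing off between failed attempts."""
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except (ConnectionError, TimeoutError) as exc:
            # Stale/reset and refused connections are ConnectionError subclasses;
            # add the driver-specific exception types of your database here.
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff: wait longer after each failure.
                sleep(base_delay * (2 ** attempt))
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

Backing off between attempts gives a failover that takes a few seconds time to complete, instead of failing the request on the first error.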

Depending on the language you are using to connect to a database, there will be a database connection manager available that will usually support pooling, which is an efficient way to use database connections.

If you are using Spring with Java, you would configure this through the connection properties of your datasource.

Core Application Tier

The core of the application is where the business functionality is written and all the valuable work is done. To make it as scalable as possible, keep it as close to stateless as possible. In traditional applications, all the tiers were in a single deployable unit, and high availability and scalability were only really necessary for the largest applications. In today's model of microservices, micro-frontends, and mobile applications, keeping session information and other state inside the application isn’t nearly as valuable, and it makes adding and removing instances more complicated than it needs to be.

The idea of a stateless application is that no data about any request persists in the application once the request has finished. This means every request is self-contained. While at first this may not sound as efficient as keeping everything (like the client’s shopping cart) in memory, it provides a level of reproducibility and scalability that a stateful application just can’t achieve without getting into complex in-memory grid caches.

A stateless application, when receiving a request, will use the information in that request to retrieve everything it needs to process the transaction. The sample below shows the incoming HTTP headers that a stateless app could receive.

Request HTTP Headers

POST /api/v1/add-item-to-cart HTTP/1.1
Host: store.example.com
Authorization: Bearer token1234
Content-Type: application/x-www-form-urlencoded
Content-Length: 22

item_id=42&num_items=5

Line 1 shows it is a POST, so it is sending data to the server
Line 2 shows the host it is sending to
Line 3 is the authentication information; we will use the token to look up which user we are working on
Line 4 shows the format of the submitted content; other options include application/xml and application/json
Line 5 shows the length of the content being sent, in bytes
Line 6 is blank and separates the headers from the content
Line 7 is the content being submitted, in this case adding 5 of item 42 to the cart

In the background, the service will look up the user, then update the cart record in the data store, and then return a response that looks like the following, showing it was successful with status code 200.

Response HTTP Headers

HTTP/1.1 200 OK
Date: Tue, 15 May 2018 12:28:53 GMT
Server: Apache
Last-Modified: Tue, 15 May 2018 12:28:53 GMT
Content-Length: 44
Content-Type: text/html
Connection: close

Success!

Line 1 is the status code (200 = Success)
Line 2 is the date it was processed
Line 3 identifies the server software that did the processing (configurable, should be hidden in production)
Line 4 is the date the document was last modified; for dynamic content this should always be now
Line 5 is the length of the content in the response
Line 6 is the format of the response
Line 7 indicates whether this connection will stay open after this response
The lines after the blank line are the content
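
The request and response flow above can be sketched as a stateless handler in Python. This is an illustration under stated assumptions rather than a full web service: the dictionaries TOKENS and CARTS are hypothetical stand-ins for external stores shared by every instance, the function name is made up, and the token is assumed to arrive in the standard Authorization header.

```python
from urllib.parse import parse_qs

# Stand-ins for external stores (hypothetical); in a real deployment these
# would be a database or cache shared by every instance of the application.
TOKENS = {"token1234": "user-1"}
CARTS = {}


def add_item_to_cart(headers, body, tokens=TOKENS, carts=CARTS):
    """Handle one request using only the request itself and the external stores."""
    # Identify the user from the bearer token carried by the request.
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    user = tokens.get(token) if scheme == "Bearer" else None
    if user is None:
        return 401, "Unauthorized"

    # Parse the form-encoded body, e.g. "item_id=42&num_items=5".
    form = parse_qs(body)
    try:
        item_id = form["item_id"][0]
        count = int(form["num_items"][0])
    except (KeyError, ValueError):
        return 400, "Bad Request"

    # Update the cart record in the store; nothing is kept in the handler.
    cart = carts.setdefault(user, {})
    cart[item_id] = cart.get(item_id, 0) + count
    return 200, "Success!"
```

Because the handler keeps nothing between calls, any instance can serve any request, which is what makes adding and removing instances straightforward.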

If performance is a concern when you are considering stateless applications, remember that externalized configuration allows operations to add and remove caching layers in front of your applications and their dependencies (for example, Memcached in front of MySQL). This can drastically speed up performance without requiring code changes or introducing a single point of failure.

Conclusion

Developers need to make a conscious effort to keep up with and follow best practices, so that the applications they deliver can be deployed in a highly available configuration and take full advantage of the elastic infrastructure available on demand in both public and private clouds.


Vince Power is an Enterprise Architect at Medavie Blue Cross. His focus is on cloud adoption and technology planning in key areas like core computing (IaaS), identity and access management, application platforms (PaaS), and continuous delivery.

