Pattern - Self-Tuning Applications

Self-Tuning Applications

This is an intriguing concept: applications that alter their own parameters in order to tune their behaviour for a specific installation.

It has appeal in the server-cluster model of deployment, in which one or more machines provide services for the clients, usually with a database behind everything. The JVMs of the front-end machines have various options related to garbage collection; the database itself has index tables and other configurable aspects of system structure.

Skilled operators and DBAs can tune these parameters to get the most out of the system: garbage collection can be altered for maximum throughput or lowest latency, while the database can take its rates of reads and writes into account to produce a layout optimised for the load.

But what if the servers themselves could do this tuning? They'd be able to adapt to demand in real time, optimising themselves for specific needs. This could mean tuning application server and database parameters, or it could even mean requesting and provisioning new front-end or back-end server machines (real or virtual) based on demand.

The hard part here is determining what options to tune, and how. Some work on testing has innovated here, in particular the work on Skoll, which deliberately changes settings in an attempt to determine which options break an application, that is, which configurations do not work. Determining which configurations work best is a harder problem, almost AI-hard. It is certainly possible to imagine an experimental system which gradually tries different options and evaluates the effectiveness of each. Such an algorithm is a form of hill climbing, and carries the usual risk: the runtime will reach a configuration that is better than any of its neighbours, but still worse than what could be achieved (a local optimum). If the tuning is applied to a live system there is another risk: that bad configurations would reduce system performance or availability. Perhaps some of the machines in a cluster could be used for this self-tuning, and if the experiments prove effective, the options could be rolled out more broadly.
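The local-optimum risk described above can be illustrated with a small sketch. This is not SmartFrog code or any existing tuner: the names `hill_climb`, `tune`, `toy_score` and `toy_neighbours` are hypothetical, and the toy scoring function stands in for a real measurement of a running system. It assumes a single integer setting with two peaks, a minor one at 3 and the true best at 15, and shows random restarts as one common way to reduce the chance of stopping at the minor peak.

```python
import random


def hill_climb(score, start, neighbours):
    """Greedy local search: move to the best neighbour until none improves."""
    current = start
    while True:
        best = max(neighbours(current), key=score, default=current)
        if score(best) <= score(current):
            return current  # no neighbour scores higher: a local optimum
        current = best


def tune(score, candidates, neighbours, restarts=5, seed=0):
    """Run hill climbing from several random starts and keep the best result."""
    rng = random.Random(seed)
    starts = rng.sample(candidates, min(restarts, len(candidates)))
    return max((hill_climb(score, s, neighbours) for s in starts), key=score)


# Toy stand-in for "measure the system under this setting" (an assumption):
# a small peak at 3 and a higher peak at 15, separated by a valley.
def toy_score(x):
    return max(5 - abs(x - 3), 10 - 2 * abs(x - 15))


def toy_neighbours(x):
    """Adjacent settings within the allowed range 0..20."""
    return [n for n in (x - 1, x + 1) if 0 <= n <= 20]


stuck = hill_climb(toy_score, 0, toy_neighbours)    # stops at the minor peak
better = hill_climb(toy_score, 12, toy_neighbours)  # reaches the higher peak
```

Starting from 0 the search stops at 3, even though 15 scores higher; only a start on the far side of the valley finds it. In a real system each call to the scoring function would be an expensive and noisy measurement, which is why running such experiments on a subset of the cluster is attractive.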

Features

  • Applications/Systems decide what configuration/optimisation parameters to use.
  • This could be based on real-time data, or on historical information.
  • Applications have a way of assessing the outcome of the changes (latency, GC delays, throughput), and so assess the value of a change.
  • Changes can be purely experimental (trial and error), or based on experience. This could be hard-coded experience, or it could be information shared from the results of other experiments.
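As a minimal sketch of how an application might assess the outcome of a change, the following compares metrics gathered under a baseline configuration with those gathered under a trial configuration. The `Metrics` fields follow the measurements named in the list above (latency, GC delays, throughput); the class and function names, the weights and the 5% margin are all illustrative assumptions, not values taken from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    latency_ms: float      # mean request latency
    gc_pause_ms: float     # mean garbage collection pause
    throughput_rps: float  # requests served per second


def cost(m, w_latency=1.0, w_gc=1.0, w_throughput=0.1):
    """Combine the metrics into one number; lower is better.

    The weights are assumptions and would need choosing per installation.
    """
    return (w_latency * m.latency_ms
            + w_gc * m.gc_pause_ms
            - w_throughput * m.throughput_rps)


def is_improvement(baseline, trial, min_gain=0.05):
    """Accept the trial only if its cost beats the baseline by a clear margin,
    so that measurement noise is not mistaken for a real gain."""
    b, t = cost(baseline), cost(trial)
    return t < b - min_gain * abs(b)


baseline = Metrics(latency_ms=120, gc_pause_ms=40, throughput_rps=500)
clear_win = Metrics(latency_ms=100, gc_pause_ms=30, throughput_rps=520)
noise = Metrics(latency_ms=119, gc_pause_ms=40, throughput_rps=505)

keep_change = is_improvement(baseline, clear_win)  # well beyond the margin
keep_noise = is_improvement(baseline, noise)       # within the margin
```

Requiring a margin rather than any improvement at all is a deliberate choice here: a change that only appears better because of noise would otherwise be kept, and the system could drift between equivalent configurations.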

Advantages

  • The goal is to automate the skills of experienced system and database administrators.
  • Reduced costs.
  • Ability to adapt dynamically to changing system state.
  • Can take advantage of virtualized/on-demand infrastructures.
  • Can perform time-consuming experimentation that humans would find a waste of time.
  • Goes hand-in-hand with Continuous Deployment. Delegate the tuning to the machines, just as deployment is automated.

Disadvantages

  • Hard to get right.
  • Risk of visible loss of service if it goes very wrong.
  • Still a research area.
  • If the underlying infrastructure is very agile (e.g. virtual machines being moved on a regular basis), even very recent data cannot be relied on to be accurate.

One example of this technology at work today is the HotSpot Java compiler from Sun: in the -server JVM, it optimises the generation of native code based on the performance statistics of the ongoing run. It does not use information from previous runs, nor does it tune heap management/garbage collection options. There is much scope for improvement here.

SmartFrog support

This is still something that we are researching. What SmartFrog and other Declarative Configuration Languages offer is a way to go from generated configurations to deployed systems. This is a foundation for self-tuning applications, but is not in itself complete.
