Performance, capacity and scalability
Performance is a measure of the speed of a system, for example how quickly a website responds to user input.
Capacity is a measure of how much data a system can process or store before performance becomes unacceptable, for example how many customers can be held on a customer database.
Performance and capacity depend on many factors.
- Software design and configuration.
- Processor speed.
- Memory size and speed.
- Data storage size and speed of access.
- Network speed.
The effect of these factors can be subtle. For example:
- Both latency (the delay before data starts to move) and throughput (the amount of data that can be moved per unit time) affect speed. For example, if a system makes many small accesses across a network, the latency of the network will have a bigger impact than its throughput.
- System configuration has a big impact. Many devices and software components use some form of cache, a local store of commonly accessed data that reduces overall access times. Memory, processors, disks, databases and web servers all use some form of caching. The configuration of caching can have a major effect.
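The latency example above can be made concrete with a rough calculation. The sketch below is a simplified model, assuming requests are made one after another and that each request pays the full network latency once; the function name and the workload figures are hypothetical.

```python
def transfer_time(requests: int, bytes_per_request: int,
                  latency_s: float, throughput_bytes_per_s: float) -> float:
    """Estimate total time for sequential requests over a network link."""
    # Each request waits for the link latency before any data moves.
    latency_cost = requests * latency_s
    # The data itself then moves at the link's throughput.
    bandwidth_cost = requests * bytes_per_request / throughput_bytes_per_s
    return latency_cost + bandwidth_cost


# Hypothetical workload: 10,000 requests of 1,000 bytes each,
# over a link with 10 ms latency and 12.5 MB/s throughput.
baseline = transfer_time(10_000, 1_000, 0.010, 12_500_000)
double_throughput = transfer_time(10_000, 1_000, 0.010, 25_000_000)
half_latency = transfer_time(10_000, 1_000, 0.005, 12_500_000)

print(f"baseline:          {baseline:.1f} s")
print(f"double throughput: {double_throughput:.1f} s")
print(f"half latency:      {half_latency:.1f} s")
```

With these figures, doubling the throughput barely changes the total, while halving the latency roughly halves it.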
Performance and capacity are usually dictated by a limiting factor, or "bottleneck". For example, if a system is limited by the speed with which data is read from disk, increasing the number or speed of processors will not increase performance.
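As a minimal sketch of this idea, a system can be modelled as a chain of stages where the slowest stage caps the overall rate. The stage names and rates below are hypothetical, and real systems rarely divide this cleanly.

```python
def pipeline_throughput(stage_rates: dict[str, float]) -> tuple[str, float]:
    """Return the slowest stage and its rate, which caps the whole pipeline."""
    bottleneck = min(stage_rates, key=stage_rates.get)
    return bottleneck, stage_rates[bottleneck]


# Hypothetical rates, in requests per second each stage can sustain.
stages = {"cpu": 5000.0, "disk_read": 800.0, "network": 3000.0}

# Doubling processor speed leaves the overall rate unchanged,
# because the disk is still the limiting factor.
faster_cpu = {**stages, "cpu": 10000.0}

# Speeding up the disk helps, but only until the next-slowest
# stage becomes the new bottleneck.
faster_disk = {**stages, "disk_read": 4000.0}

for label, rates in [("original", stages),
                     ("faster cpu", faster_cpu),
                     ("faster disk", faster_disk)]:
    name, rate = pipeline_throughput(rates)
    print(f"{label}: limited by {name} at {rate:.0f} requests/s")
```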
As load approaches and then exceeds capacity, systems should cope with the excess load gracefully. In an online system, this could be achieved by queuing requests to process later, or by cleanly rejecting new requests. But you must avoid the system "grinding to a halt" or failing midway through requests in a way that undermines the integrity of the data.
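One way to sketch queuing combined with clean rejection is a bounded queue: requests are accepted until a fixed backlog is reached, and anything beyond that is refused immediately rather than left to fail halfway through. The class name, method names and backlog size here are illustrative assumptions, not a prescribed design.

```python
import queue


class RequestIntake:
    """Accept requests up to a fixed backlog, then reject cleanly."""

    def __init__(self, max_pending):
        # A bounded queue: put_nowait raises queue.Full once it is full.
        self._pending = queue.Queue(maxsize=max_pending)

    def submit(self, request):
        """Return True if the request was queued, False if it was rejected."""
        try:
            self._pending.put_nowait(request)
            return True
        except queue.Full:
            # Reject before any work starts, so no data is left half-updated.
            return False

    def process_next(self):
        """Process one queued request, or return None if nothing is waiting."""
        try:
            request = self._pending.get_nowait()
        except queue.Empty:
            return None
        return f"processed {request}"


intake = RequestIntake(max_pending=2)
accepted = [intake.submit(f"req-{i}") for i in range(3)]
print(accepted)  # the third request is rejected because the backlog is full
print(intake.process_next())
```

A rejected caller can be told to retry later, which is usually preferable to a request that times out after partially completing.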
Scalability is a measure of how easy it is to increase performance and capacity.
Scalability is a serious issue on large systems. There are a number of approaches.
- Upgrading to faster hardware. This is known as "scaling high" (more commonly called scaling up, or vertical scaling), and works provided that the limiting factor is itself upgraded.
- Running the system over more servers. This is known as "scaling wide" (more commonly called scaling out, or horizontal scaling). By splitting the processing into smaller units, it may be possible to sidestep bottlenecks. However, the additional need to coordinate the workload between servers adds a new overhead and may itself become a bottleneck.
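The diminishing returns from adding servers can be illustrated with Amdahl's law, a standard model in which some fraction of the work cannot be spread across servers. Treating coordination overhead as a fixed serial fraction is a simplifying assumption made here for illustration; the 10% figure is hypothetical.

```python
def amdahl_speedup(servers: int, serial_fraction: float) -> float:
    """Speedup predicted by Amdahl's law for a given number of servers."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    # The serial part takes the same time regardless of server count;
    # only the remaining part is divided between servers.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / servers)


# Hypothetical system where 10% of the work cannot be parallelised.
for n in (1, 2, 4, 8, 16, 64):
    print(f"{n:>2} servers: {amdahl_speedup(n, 0.10):.2f}x")
```

Under this model, the speedup can never exceed 1 divided by the serial fraction, however many servers are added.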
Planning, modelling, testing and monitoring
Performance and capacity need to be managed throughout the system life cycle. During development, capacity needs to be planned and performance tested. When the system is run live, performance needs to be monitored, as do key statistics about capacity such as processor and memory usage.
Size systems to meet peak processing requirements, for example the busiest hour on the busiest day.
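The difference between sizing for the peak and sizing for the average can be sketched as follows. The hourly request counts and the per-server capacity are made-up figures, and the function names are hypothetical.

```python
import math


def servers_for_peak(hourly_requests: list[int], per_server_per_hour: int) -> int:
    """Servers needed to handle the busiest hour in the sample."""
    return math.ceil(max(hourly_requests) / per_server_per_hour)


def servers_for_average(hourly_requests: list[int], per_server_per_hour: int) -> int:
    """Servers needed if load were spread evenly, shown for comparison only."""
    average = sum(hourly_requests) / len(hourly_requests)
    return math.ceil(average / per_server_per_hour)


# Hypothetical request counts for each hour of the busiest day.
busiest_day = [200, 150, 100, 100, 150, 400, 900, 1800,
               2600, 3000, 2800, 2500, 3200, 2900, 2700, 2600,
               2400, 2000, 1500, 1200, 900, 600, 400, 300]

# Assumed capacity of one server: 1,000 requests per hour.
print("sized for peak:   ", servers_for_peak(busiest_day, 1000))
print("sized for average:", servers_for_average(busiest_day, 1000))
```

Sizing from the average would leave the system short of capacity during the busiest hours.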
Over-capacity
Because of the difficulties of correctly calculating capacity requirements, many systems end up with capacity that they can never use. Some systems are deployed on expensive hardware that scales well even though the system is not likely to need the additional capacity before the hardware is obsolete.
Over-capacity can at times be very wasteful. It is not unusual to find servers that cost tens or hundreds of thousands of pounds that are hardly used.
Two technologies help reduce over-capacity.
- Clustering technology, in which a system can run over multiple servers, allows capacity to be added piecemeal when it is required.
- Virtualization, in which a physical server is configured to run as multiple logical servers, allows the logical servers to be rearranged between physical servers to accommodate growth without requiring over-capacity.
Management tips
- When calculating capacity, it is sensible to add a margin for error. However, if you add a margin for error to each input and each calculation, you will end up with significant over-capacity. Calculate capacity using best guess figures at each stage of the calculation, and then add a margin only to the final result.
- If you have performance or capacity problems, don't rush to upgrade hardware. Find out the limiting factor. Often problems can be addressed by relatively minor changes to software or configuration, such as increasing cache sizes or rewriting data access. Ignoring underlying performance problems and merely purchasing more hardware leads to high costs in the long run.
- The management of performance, capacity and scalability of large or critical systems will benefit from specialist assistance.
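The first tip, on margins for error, can be checked with simple arithmetic. The sketch below compares adding a margin to every input with adding it once to the final result; the inputs and the 20% margin are hypothetical.

```python
import math


def capacity_margin_at_end(inputs: list[float], margin: float) -> float:
    """Multiply best-guess inputs, then add the margin once to the result."""
    return math.prod(inputs) * (1 + margin)


def capacity_margin_each_step(inputs: list[float], margin: float) -> float:
    """Add the margin to every input before multiplying (compounds the margin)."""
    return math.prod(value * (1 + margin) for value in inputs)


# Hypothetical best guesses: users, requests per user per day, bytes per request.
best_guesses = [2_000, 30, 50_000]
margin = 0.20

best = math.prod(best_guesses)
at_end = capacity_margin_at_end(best_guesses, margin)
each_step = capacity_margin_each_step(best_guesses, margin)

print(f"margin at end:     {at_end / best:.3f}x the best guess")
print(f"margin every step: {each_step / best:.3f}x the best guess")
```

With three inputs and a 20% margin on each, the compounded result is about 1.73 times the best guess, compared with 1.2 times when the margin is applied once.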
See
Wikipedia articles on Performance engineering, Capacity management and Scalability.
