Solution Architect

Persona

Creates technical solution designs that meet functional requirements while minimizing energy consumption and resource waste.

16 patterns

Architecture

Adopt serverless architecture for AI/ML workload processes

Building an ML model takes significant computing resources that need to be optimized for efficient utilization.

  • ai
  • machine-learning
  • serverless
  • size:small
Containerize your workloads

Containerizing workloads enables better resource utilisation and bin packing, reducing unnecessary compute allocation and embodied carbon compared to running full virtual machines.

  • cloud
  • compute
  • kubernetes
  • role:cloud-engineer
  • size:medium
Implement stateless design

Service state refers to the in-memory or on-disk data required by a service to function. State includes the data structures and member variables that the service reads and writes. Depending on how the service is architected, the state might also include files or other resources stored on disk. Applications that hold large amounts of in-memory or on-disk data require larger VM sizes; in cloud computing, this means larger VM SKUs to support high RAM capacity and multiple data disks.

  • cloud
  • compute
  • kubernetes
  • role:software-engineer
  • size:medium
Queue non-urgent processing requests

All systems have periods of peak and low load. From a hardware-efficiency perspective, we use hardware more efficiently if we smooth out request spikes with an implementation that keeps component utilization even. From an energy-efficiency perspective, we use energy more efficiently if we keep idle resources to a minimum.

  • cloud
  • size:small
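As an illustration of the queuing idea above, here is a minimal sketch in Python. The `DeferredQueue` class, its `busy_threshold` parameter, and the utilization value passed to `drain` are all hypothetical names chosen for this example, not part of the pattern itself. Urgent work runs immediately; everything else waits until the caller reports that utilization has dropped below the threshold.

```python
from collections import deque
from typing import Callable, Deque, List


class DeferredQueue:
    """Holds non-urgent jobs until the system reports spare capacity."""

    def __init__(self, busy_threshold: float = 0.7) -> None:
        # Hypothetical cut-off: at or above this utilization (0.0 to 1.0)
        # the system counts as busy and queued jobs are left waiting.
        self.busy_threshold = busy_threshold
        self._jobs: Deque[Callable[[], object]] = deque()

    def submit(self, job: Callable[[], object], urgent: bool = False) -> object:
        """Run urgent jobs right away; queue everything else."""
        if urgent:
            return job()
        self._jobs.append(job)
        return None

    def drain(self, current_utilization: float, max_jobs: int) -> List[object]:
        """Run up to max_jobs queued jobs, but only when load is low."""
        results: List[object] = []
        if current_utilization >= self.busy_threshold:
            return results
        while self._jobs and len(results) < max_jobs:
            results.append(self._jobs.popleft()())
        return results

    def pending(self) -> int:
        """Number of jobs still waiting in the queue."""
        return len(self._jobs)
```

In a real deployment the queue would more likely be a managed message queue and the utilization signal would come from monitoring; the in-process version only shows the control flow.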
Reduce network traversal between VMs

Placing VMs in the same region or availability zone minimises the physical distance data must travel between instances, reducing the energy consumed by network traversal.

  • cloud
  • compute
  • kubernetes
  • network
  • role:cloud-engineer
  • size:medium
Run AI models at the edge

Data computation for ML workloads and ML inference is a significant contributor to the carbon footprint of an ML application. In addition, if the ML model runs in the cloud, the data must first be transferred to the cloud and converted there into the format the model requires for inference.

  • ai
  • machine-learning
  • size:small
Scale logical components independently

Decomposing applications into independently scalable microservices allows each component to be right-sized for its own demand, reducing overall compute resource consumption and embodied carbon.

  • cloud
  • compute
  • kubernetes
  • role:cloud-engineer
  • role:software-engineer
  • size:medium
Use a service mesh only if needed

Service meshes add overhead through additional containers and increased network traffic, so they should only be deployed for applications that genuinely require the capabilities they provide.

  • cloud
  • kubernetes
  • network
  • role:cloud-engineer
  • security
  • size:medium
Use serverless cloud services

Serverless cloud services scale dynamically with demand and share infrastructure across many applications, reducing idle resource consumption and lowering embodied carbon emissions.

  • cloud
  • serverless
  • size:small

Operations

Optimise storage utilization

It's better to maximise storage utilisation so the storage layer is optimised for the task, not only in terms of energy proportionality but also in terms of embodied carbon. Two storage units running at low utilization rates will consume more energy than one running at a high utilization rate. In addition, the unused capacity on the underutilised storage unit could be more efficiently used for another task or process.

  • size:small
  • storage
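The claim that two lightly used storage units draw more energy than one well-used unit can be illustrated with a simple linear power model. This is a sketch under stated assumptions: the function name, the linear model, and the 50 W idle / 100 W full-load figures are hypothetical values picked for the arithmetic, not measurements of any real device.

```python
def unit_power_watts(utilization: float,
                     idle_watts: float = 50.0,
                     max_watts: float = 100.0) -> float:
    """Assumed linear model: idle draw plus a share of the dynamic range."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return idle_watts + (max_watts - idle_watts) * utilization


# Same total workload, placed two different ways.
two_units_at_25_percent = 2 * unit_power_watts(0.25)
one_unit_at_50_percent = unit_power_watts(0.50)
```

Under these assumed figures the two-unit layout draws 125 W and the consolidated layout draws 75 W, because the idle draw is paid once instead of twice.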
Optimize average CPU utilization

CPU usage and utilization vary throughout the day, sometimes wildly, depending on computational requirements. The larger the gap between the average and peak CPU utilization values, the more resources need to be provisioned in stand-by mode to absorb those spikes in traffic.

  • compute
  • monitoring
  • size:medium
Optimize peak CPU utilization

CPU usage and utilization vary throughout the day, sometimes wildly, depending on computational requirements. The larger the gap between the average and peak CPU utilization values, the more resources need to be provisioned in stand-by mode to absorb those spikes in traffic.

  • compute
  • monitoring
  • size:medium
Scale infrastructure with user load

Demand for resources depends on user load at any given time. However, most applications run without taking this into consideration. As a result, resources are underused and inefficient.

  • cloud
  • compute
  • size:medium
Scale Kubernetes workloads based on relevant demand metrics

By default, Kubernetes scales workloads based on CPU and RAM utilization. In practice, however, it's difficult to correlate your application's demand drivers with CPU and RAM utilization. Scaling your workload on the metrics that actually drive demand for your application, such as HTTP requests, queue length, and cloud alerting events, can help reduce resource utilization, and therefore also your carbon emissions.

  • cloud
  • kubernetes
  • role:cloud-engineer
  • role:software-engineer
  • serverless
  • size:medium
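To make demand-based scaling concrete, the sketch below applies the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler, `ceil(currentReplicas * currentMetric / targetMetric)`, to a demand metric such as queue length per pod. The function name, the parameter names, and the replica bounds are assumptions for this example; the autoscaler's tolerance band and stabilization behavior are left out.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Replica count that brings the per-pod metric back toward its target.

    current_metric and target_metric are per-pod averages of a demand
    metric (for example, queued messages per pod).
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (hypothetical defaults above).
    return max(min_replicas, min(max_replicas, raw))
```

For example, 3 replicas each holding 50 queued messages against a target of 20 per pod yields `desired_replicas(3, 50, 20)`, which evaluates to 8; when the queue shrinks to 5 per pod across 4 replicas, the result drops to 1.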