Introduction ============ This document describes how the Azure provider works; how Azure concepts are mapped, and some high-level descriptions of how the provider is implemented. References ========== Service Management REST API Reference http://msdn.microsoft.com/library/azure/ee460799.aspx "What is a cloud service?" (This is out of date, but still somewhat useful.) http://azure.microsoft.com/en-us/documentation/articles/cloud-services-what-is/ Cloud Services, Virtual Machines and Virtual Network SLA https://www.microsoft.com/windowsazure/sla/ Azure Concepts ============== Affinity Group An Affinity Group is used to group VMs, storage, etc. within a location. It is possible to simply associate each of the entities with the same location, however it is recommended to use an Affinity Group. Azure may then locate entities as close to each other as possible within a data- centre, as opposed to just being in the same data-centre. We create an Affinty Group per environment called "juju--ag". Cloud Service (AKA "Hosted Service" in API) A Cloud Service is a container for VMs, and is the publicly-addressable entity for Azure's compute service. Each Cloud Service must have a globally-unique name, which is used to derive its domain name. A Cloud- Service has only one domain name and public IP address. We currently prefix Cloud Services that we create with "juju--", and randomly generate the remainder of the name. Deployment A Cloud Service may have up to two deployments: Production and Staging. The deployments may be swapped at runtime, so that a Staging deployment can be upgraded whilst maintaining a Production deployment that can be swapped back to should the upgrade go wrong. We currently only ever create a Production deployment. Role For our purposes, a Role describes a Virtual Machine (VM). The description includes the hostname, initial username and password, attached disk(s), network configuration, and which Availability Set (if any) the role belongs to. A role may have multiple instances (Role Instance), however we only ever have one per role. The Juju "Instance" implementation for the Azure provider maps to a Role Instance, although the implementation details differ between the current and the original implementation. A role/instance has only Virtual IPs (VIPs), i.e. internal addresses. The role may expose "endpoints", i.e. TCP/UDP ports. Each port may be exposed by at most one role in a Cloud Service, unless that port is part of a load- balancing set. When a client connects to the containing Cloud Service's address + the endpoint port, the traffic will be routed to the role instance(s) that expose that endpoint. Availability Set Each Cloud Service has zero or more Availability Sets within it; a Role can be assigned to at most one of them. As long as there are at least two Roles in the same Availability Set, then Azure will guarantee at least 99.95% availability under the Azure Service Level Agreement (SLA). We create a single Availability Set for each Cloud Service, and all roles are added to it. Thus, all Juju-deployed services are, by default, covered by the Azure SLA. Azure provider implementation ============================= Bootstrapping Before the Azure provider can bootstrap an environment, it expects one and only one thing to be pre-existing: a storage account; this storage account is used for the environment's Storage() implementation. When the environment is bootstrapped, the provider creates an affinity group and a virtual network. The Azure provider now has two modes in which it can operate, configured at bootstrap time with the "availability-sets-enabled" boolean attribute. If it is true (henceforth "azure-mode"), then support for multiple Roles per Cloud- Service is enabled; otherwise there is a single Role per Cloud Service. By default, we set availability-sets-enabled=true for new environments. Note: by setting availability-sets-enabled=false, you will gain the abililty to place units on any machine at the cost of losing Azure's SLA guarantees. See "Starting instances" below for more details on what is prevented by azure-mode. Starting instances When *not* in azure-mode, the provider will simply create a new Cloud- Service, Deployment, and Role. The user may use placement directives and add machines like in any other provider, but they will not have any availability guarantees provided by Azure. When in azure-mode, the provider must impose several restrictions to achieve availability guarantees: - placement (i.e. --to ) is disallowed, - "juju add-machine" is disallowed. Thus, an instance can only be created in response to adding a unit to a service, and an instance can contain only units of a single service type. Instances are deployed to the same Cloud Service as the other instances of its "Distribution Group" (a new Juju provisioning concept introduced to support azure-mode, Availability Zones, etc.), or a new Cloud Service if the group is empty. For a state server, the Distribution Group is the set of instances that contain other state server instances; for any other instance, the Distribution Group is the set of instances that contain units of the same service that the instance is being provisioned for. All new instances are added to the same Availability Set within the Cloud Service. Instance mapping In the Azure provider, we map instance IDs to a Role within a Cloud- Service. The instance-id format is servicename-rolename, where servicename is the name of the Cloud Service and rolename is the name of the Role. We only ever have one Deployment in a Cloud Service, so it is inferred. Originally the provider was written such that instance IDs mapped only to Cloud Services, as originally there was a one-to-one correspondence between Juju-deployed Cloud Services and Roles. This has had to change to cater for Azure's recommended high availability practices. We maintain backwards compatability with old deployments; existing Cloud Services are assumed to have a single role within. Exposing Services Because a Cloud Service has a single public IP, and instances within do not have their own, exposed services will have their ports load balanced across each instance whose unit opens the port. Load balancing only takes effect when connecting to the public (Cloud Service) address. In azure-mode, the state server and API server ports are automatically load-balanced. When HA Juju is enabled, this means that API client connections will be load balanced automatically by Azure. In azure-mode, the SSH port is also load-balanced so that port 22 may be used uniformly across providers. To ensure that "juju ssh" connects to the correct instance, we connect to the instance's internal address by first proxying through the API server.