AEM in the cloud?

Setting up AEM in the cloud is now easier and more automated than ever. Adobe Cloud Manager provides automated tools for continuous integration and delivery (CI/CD) and for testing the application code deployed on top of the cloud environment. Different AEM deployment models are explained in this Cognifide blog post. But what if we would like to run AEM in the cloud (AWS, Azure, GCP etc.) outside the Adobe framework, for example for development or testing purposes?

Setting up AEM in the cloud can be divided into setting up the underlying infrastructure and deploying the AEM application. Making this process easy yet flexible requires some knowledge of cloud services, infrastructure security, automation frameworks and the DevOps approach. What does that mean?

Infrastructure as Code (IaC), together with Configuration Management paradigms, allows us to design and implement a framework for setting up AEM in a secure and reliable virtual environment in the cloud using the Gradle AEM Plugin (GAP). We called it GAp-In-Aws (GAIA) because it currently supports AWS infrastructure setup, but the solution described in this article can be cloud agnostic: it just requires adapting the Terraform configuration to the chosen cloud provider. Sounds good?

Solution architecture

In the proposed GAIA solution, we divided the process of setting up AEM in the cloud into two parts:

  • Infrastructure management – this sub-process is run first. In this part we configure the basic cloud infrastructure: we use Terraform to set up an EC2 instance with an EBS volume, and the AWS CLI (Command Line Interface) to configure a DNS record. This part of the process is tied to the services and capabilities of the cloud provider.
  • Configuration management – this sub-process is run on top of the running cloud infrastructure. In this part we install all the needed tools and run AEM and httpd with the dispatcher. We use Ansible to configure AEM. This part of the process is cloud agnostic: it can be run on any cloud (or even on-premises) environment that provides a Unix-based runtime and access via the SSH (Secure Shell) protocol.

GAIA solution architecture

A nice thing about GAIA is that it lets us spin up a new environment quickly, which enables testing per feature branch. This makes the process suitable for automated feature branch tests on a unique environment with a dedicated URL.
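How the dedicated URL is derived from a feature branch is not described here. As a rough sketch of one possible approach, a branch name could be turned into a DNS-safe label and placed under a base domain. The helper name branch_to_hostname and the domain gaia.example.com are hypothetical and not part of GAIA.

```python
import re

MAX_DNS_LABEL_LENGTH = 63  # a single DNS label may not exceed 63 characters


def branch_to_hostname(branch: str, base_domain: str) -> str:
    """Turn a Git branch name into a hostname under base_domain (hypothetical helper)."""
    # Lower-case and collapse every run of non-alphanumeric characters into one hyphen.
    label = re.sub(r"[^a-z0-9]+", "-", branch.lower())
    # A label must not start or end with a hyphen; strip again after truncating.
    label = label.strip("-")[:MAX_DNS_LABEL_LENGTH].strip("-")
    if not label:
        raise ValueError(f"branch name {branch!r} yields an empty DNS label")
    return f"{label}.{base_domain}"


if __name__ == "__main__":
    # gaia.example.com is a placeholder domain used only for illustration.
    print(branch_to_hostname("feature/JIRA-123_new-header", "gaia.example.com"))
```

With the placeholder values above, this prints feature-jira-123-new-header.gaia.example.com.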

With security in mind, GAIA requires the following permissions:

  • Programmatic access to the cloud environment for Terraform to set up the EC2 instance (in the case of AWS: access keys)
  • Programmatic access to the DNS server to configure the DNS record that binds the IP address of the EC2 instance to the domain name (at least write permissions)
  • Access to the EC2 instance via SSH for Ansible (in the case of AWS: a PEM security key)
  • Access to the AEM application code repository
  • Access to the AEM jar and license files, or to an already prepared GAP instance backup as a .zip file

With all these permissions in place, GAIA can be run from any developer workstation (OSX, Linux or Windows with WSL). It is preferable, however, to set up CI/CD jobs (e.g. in Jenkins, Bamboo, Azure DevOps) running on agents that hold all of these permissions, so that they can perform both the infrastructure setup and the AEM configuration.

How simple is it?

Creating a ready-to-use environment takes two main parts. Below is an example of running the full setup on AWS with a custom DNS for routing. All steps are idempotent, so there is no risk of creating a duplicate environment or overwriting an existing one.

Cloud Infrastructure

The starting point is to align the Terraform configuration with the desired requirements by customizing parameters such as region, domain, instance type, whitelisting and many others. At this stage, the provider key must also be configured. The next step is to execute:

terraform apply

Terraform then prints the list of actions it plans to perform in the cloud, which the user must confirm. Here is an example of this output for the infrastructure in the AWS cloud:

  ...
  # create EC2 instance
  + resource "aws_instance" "gaia-ec2-instance" {
      + ami                          = "ami-022641bd898ee8b20"
      + availability_zone            = "eu-central-1a"
      + get_password_data            = false
      + instance_type                = "m5.xlarge"
      + key_name                     = "instance_key"
      + source_dest_check            = true
      + tags                         = {
          + "Env"  = "test"
          + "Name" = "gaia-ec2-instance"
        }

      + ebs_block_device {
          + delete_on_termination = true
          + device_name           = "/dev/xvdb"
          + volume_size           = 30
          + volume_type           = "gp2"
        }
    }

  # create security group
  + resource "aws_security_group" "sec_group" {
      + description            = "Managed by Terraform"
      + egress                 = [
          + {
              + cidr_blocks      = [
                  + "0.0.0.0/0",
                ]
              + description      = ""
              + from_port        = 0
              + ipv6_cidr_blocks = []
              + prefix_list_ids  = []
              + protocol         = "-1"
              + security_groups  = []
              + self             = false
              + to_port          = 0
            },
        ]
      + id                     = (known after apply)
      + ingress                = [
          + {
              + cidr_blocks      = [
                # list of allowed IPs
                ...
                ]
              + description      = ""
              + from_port        = 22
              + ipv6_cidr_blocks = []
              + prefix_list_ids  = []
              + protocol         = "tcp"
              + security_groups  = []
              + self             = false
              + to_port          = 22
            },
          + {
              + cidr_blocks      = [
                # list of allowed IPs
                ...
                ]
              + description      = ""
              + from_port        = 4502
              + ipv6_cidr_blocks = []
              + prefix_list_ids  = []
              + protocol         = "tcp"
              + security_groups  = []
              + self             = false
              + to_port          = 4503
            },
          + {
              + cidr_blocks      = [
                # list of allowed IPs
                ...
                ]
              + description      = ""
              + from_port        = 80
              + ipv6_cidr_blocks = []
              + prefix_list_ids  = []
              + protocol         = "tcp"
              + security_groups  = []
              + self             = false
              + to_port          = 80
            },
        ]
      + name                   = "sec_group"
      + owner_id               = (known after apply)
      + revoke_rules_on_delete = false
      + vpc_id                 = (known after apply)
    }

  + resource "null_resource" "export_render_values" {
      + id = (known after apply)
    }

Plan: 3 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + public_ip = (known after apply)

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

After about 3 minutes the AWS instance is up and running. Now it is time to configure DNS routing:

aws route53 change-resource-record-sets --hosted-zone-id {id} --change-batch file://configureDNS.json --profile dns
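The contents of configureDNS.json are not shown in this article. As a sketch, assuming a single A record that points the environment's hostname at the instance's public IP, the file could be generated in the Route 53 change batch format like this. The helper name build_change_batch, the hostname, the IP address and the TTL are placeholders chosen for illustration.

```python
import json


def build_change_batch(hostname: str, public_ip: str, ttl: int = 300) -> dict:
    """Build a Route 53 change batch with one A record (hypothetical helper)."""
    return {
        "Comment": f"Point {hostname} at the GAIA EC2 instance",
        "Changes": [
            {
                # UPSERT creates the record if it is missing and updates it otherwise,
                # so running the DNS step again does not fail or duplicate the record.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": hostname,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": public_ip}],
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values: in practice the IP would come from the Terraform
    # output named public_ip shown in the plan above.
    batch = build_change_batch("feature-x.gaia.example.com", "203.0.113.10")
    print(json.dumps(batch, indent=2))
```

Redirecting the printed JSON into configureDNS.json would give the aws route53 command above a file to read.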

At the end of this process we have a virtual machine running in the cloud and DNS configured to point to this instance under the configured URL. The EC2 instance has the following ports open:

  • 4502 for AEM author
  • 4503 for AEM publish
  • 80 for Apache Dispatcher
  • 22 for SSH admin access and Ansible

Security rules are part of the configuration too and allow access to the environment only from given subnets, the company VPN network etc. The Terraform scripts can be adapted to support other cloud providers too.
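To verify from a workstation or CI agent that the ports listed above are reachable, a small smoke test might look like the sketch below. It is not part of GAIA, and the function names are hypothetical. It only attempts a TCP connection, so a port counts as open when the security group allows the caller's IP and a service is actually listening.

```python
import socket

# Ports opened by the security group described above.
GAIA_PORTS = {
    22: "SSH",
    80: "Apache Dispatcher",
    4502: "AEM author",
    4503: "AEM publish",
}


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_environment(host: str, timeout: float = 3.0) -> dict:
    """Map each GAIA port to whether it accepted a TCP connection."""
    return {port: is_port_open(host, port, timeout) for port in GAIA_PORTS}
```

Right after the infrastructure step, only port 22 would be expected to respond. Ports 80, 4502 and 4503 should start answering once AEM and the dispatcher are running.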

AEM Infrastructure

To manage the configuration of the cloud instance, we use Ansible. Ansible does not support Windows as a control machine, so if you have to use Windows, a solution is the Windows Subsystem for Linux (WSL), which allows you to install a Linux distribution and proceed. As in the previous part, all parameters can be configured.

Here is a snippet of the Ansible playbook that starts the AEM instances and the dispatcher:

- name: GAIA playbook
  hosts: gaia

  tasks:
    # ...

    # --- run AEM
    - name: GAP - run AEM
      shell: |
        ./gradlew environmentHosts -q
        sh env/hosts
        ./gradlew instanceUp
      args:
        chdir: '{{ gap_path }}'

    # --- run Dispatcher
    - name: GAP - run httpd+dispatcher
      shell: |
        ./gradlew environmentUp
      args:
        chdir: '{{ gap_path }}'

where gap_path is any location on the mounted EBS volume (e.g. /data/gaia/gaia-gap).

The playbook can then be executed with a single command:

ansible-playbook -i inventory startAEM.yaml
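The inventory file passed with -i is not shown in this article. A minimal sketch of how it might be generated, assuming an INI inventory with a single host in the gaia group (the group has to match hosts: gaia in the playbook). The helper name render_inventory, the ec2-user login and the key path are assumptions that depend on the chosen AMI and local setup.

```python
def render_inventory(host: str, user: str, private_key_file: str, group: str = "gaia") -> str:
    """Render a one-host INI inventory for Ansible (hypothetical helper)."""
    return (
        f"[{group}]\n"
        # ansible_user and ansible_ssh_private_key_file are standard Ansible
        # connection variables; the values passed in are environment specific.
        f"{host} ansible_user={user} ansible_ssh_private_key_file={private_key_file}\n"
    )


if __name__ == "__main__":
    # Placeholder address, user and key path, for illustration only.
    print(render_inventory("203.0.113.10", "ec2-user", "~/.ssh/instance_key.pem"), end="")
```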

You’ll be asked for credentials to be able to download AEM jar and license files from a configured URL. After around 16 minutes AEM is ready to use.

The full Ansible playbook installs all the needed tools (wget, git, Java 11, Docker), then creates a workspace, downloads the AEM installation files and runs the Gradle AEM Plugin to start the AEM instances (author + publish). Finally, it deploys the Dispatcher Apache module inside a Docker container.

The described solution benefits from the idempotent nature of GAP. Running the Ansible playbook again on an already configured AEM is safe and does not repeat tasks that have already finished.

End of work

A great benefit of separating infrastructure and configuration management is the ability to stop the AEM instance while keeping the environment. To achieve that, you may prepare a stopAEM.yaml playbook with the proper configuration to stop the AEM instance and then run:

ansible-playbook -i inventory stopAEM.yaml

When the environment is no longer needed, one command stops it and removes it completely from AWS:

terraform destroy

The steps above assume that all prerequisites are met: Ansible and Terraform are installed, and the required keys and permissions are in place.
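For a CI/CD job, the setup commands from this article could be chained in a small wrapper. The sketch below is one hedged way to do it and is not how GAIA itself is packaged: the function names are hypothetical, the hosted zone ID is a placeholder, and it assumes all commands run from one working directory with configureDNS.json and inventory already in place. The -auto-approve flag skips Terraform's interactive confirmation, which an unattended job would need.

```python
import subprocess
from typing import List


def build_setup_commands(hosted_zone_id: str, auto_approve: bool = False) -> List[List[str]]:
    """Return the setup commands from this article in execution order."""
    terraform = ["terraform", "apply"]
    if auto_approve:
        terraform.append("-auto-approve")  # no interactive 'yes' prompt
    return [
        terraform,
        ["aws", "route53", "change-resource-record-sets",
         "--hosted-zone-id", hosted_zone_id,
         "--change-batch", "file://configureDNS.json",
         "--profile", "dns"],
        ["ansible-playbook", "-i", "inventory", "startAEM.yaml"],
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for command in commands:
        print("+ " + " ".join(command))
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Z_PLACEHOLDER stands in for a real hosted zone ID; dry_run only prints.
    run_all(build_setup_commands("Z_PLACEHOLDER", auto_approve=True), dry_run=True)
```

Keeping dry_run as the default makes it possible to review the exact commands before anything is created in the cloud.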

Summary

The solution is designed for faster and easier management of non-production environments (like DEV, INT) using the battle-tested Gradle AEM Plugin. A solution based on Terraform and Ansible has a lower entry barrier for developers than, for example, the Chef-based solutions which we've been using at Cognifide so far. It is less complex, needs only an SSH connection and, because it is agentless, doesn't require additional configuration servers (like a Chef server). Ansible configuration is based on YAML, so in our opinion it is more human friendly than Chef's Ruby-based configuration.

Ideas for next steps are:

  • Introduce Author Dispatcher
  • Prepare Terraform configuration for Azure, GCP and other cloud providers
  • Prepare a VirtualBox image with a ready-to-use toolset
  • Think about a version for more complex environments (more VMs, load balancers etc.)

GAIA benefits from the environment-as-code approach. Provisioning the infrastructure through versioned definition files and keeping the AEM configuration in one place makes the solution compact and trackable.

GAIA is not suitable for setting up complex or production-like environments, as it assumes that all AEM components run on a single virtual machine without redundancy, load balancers etc. Keeping that in mind, GAIA can be the right choice for developers and testers, enabling them to conduct various types of tests in any number of isolated environments. Imagine a complete AEM environment, with code deployed from a given feature branch, spun up on demand in the cloud in less than 20 minutes – yes, it is possible and seems cool!

The table below presents the execution time of each step, measured for two different AEM sources.

AEM source     | AWS setup | AEM download (Cognifide box) | Instance setup | TOTAL (without download)
JAR + license  | 3 min     | 2 min                        | 14 min         | 19 min (17 min)
Backup ZIP     | 3 min     | 12 min                       | 4 min          | 19 min (7 min)