AWS: Use the AWS CLI to delete snapshots from your account

The Amazon EC2 console allows you to delete up to 50 Amazon Elastic Block Store (Amazon EBS) snapshots at once. To delete more than 50 snapshots, use the AWS Command Line Interface (AWS CLI) or the AWS SDK.

To see all the snapshots that you own in a specific region, run the following. Note: replace af-south-1 with your region:

aws ec2 describe-snapshots --owner-ids self  --query 'Snapshots[]' --region af-south-1

Note: To run the code below, first make sure you're in the correct account (or life will become difficult for you). Next, replace BOTH instances of "af-south-1" with your particular region. Finally, you can use a specific account number in place of --owner-ids=self (e.g. --owner-ids=1234567890).

for SnapshotID in $(aws ec2 --region af-south-1 describe-snapshots --owner-ids=self --query 'Snapshots[*].SnapshotId' --output=text); do
aws ec2 --region af-south-1 delete-snapshot --snapshot-id ${SnapshotID}
done
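
If you want to check what the loop will do before running it destructively, a cautious variant (a sketch) is to add the standard --dry-run flag, which validates each call without actually deleting anything:

for SnapshotID in $(aws ec2 --region af-south-1 describe-snapshots --owner-ids=self --query 'Snapshots[*].SnapshotId' --output=text); do
# --dry-run reports whether the call would succeed (DryRunOperation) without deleting the snapshot
aws ec2 --region af-south-1 delete-snapshot --dry-run --snapshot-id ${SnapshotID}
done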

How to make an offline copy of a static website using wget and hosting on AWS S3 with CloudFront

I have an old website for which I want to avoid the hosting costs, so I just wanted to download the website and run it from an AWS S3 bucket, using CloudFront to publish the content. Below are the steps I took to do this:

First download the website to your laptop

$ wget \
     --recursive \
     --no-clobber \
     --page-requisites \
     --html-extension \
     --convert-links \
     --no-check-certificate \
     --restrict-file-names=unix \
     --domains archive.andrewbaker.ninja \
     --no-parent \
         http://archive.andrewbaker.ninja/
$ cd archive.andrewbaker.ninja
$ ls

Below is a summary of the parameters (inc common alternatives):

--recursive: Wget is capable of traversing parts of the Web (or a single HTTP or FTP server), following links and directory structure. We refer to this as recursive retrieval, or recursion.

--no-clobber: If a file is downloaded more than once in the same directory, Wget's behaviour depends on a few options, including -nc. In certain cases the local file will be clobbered (overwritten) upon repeated download; in other cases it will be preserved. When running Wget without -N, -nc, or -r, downloading the same file in the same directory will result in the original copy of the file being preserved and the second copy being named file.1. If that file is downloaded yet again, the third copy will be named file.2, and so on. When -nc is specified, this behaviour is suppressed, and Wget will refuse to download newer copies of the file. Therefore "no-clobber" is actually a misnomer in this mode: it is not clobbering that is prevented (the numeric suffixes were already preventing clobbering), but rather the multiple version saving.

When running Wget with -r, but without -N or -nc, re-downloading a file will result in the new copy simply overwriting the old. Adding -nc will prevent this behaviour, instead causing the original version to be preserved and any newer copies on the server to be ignored. When running Wget with -N, with or without -r, the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and size of the file (see the Time-Stamping section of the Wget manual). -nc may not be specified at the same time as -N. Note that when -nc is specified, files with the suffixes .html or (yuck) .htm will be loaded from the local disk and parsed as if they had been retrieved from the Web.

--page-requisites: This causes Wget to download all the files that are necessary to properly display a given HTML page, including images, CSS, JS, etc. (A related option is --adjust-extension, which preserves proper file extensions for .html, .css and other assets; it is the newer name for --html-extension below.)

--html-extension: This adds .html to the downloaded filename, to make sure it plays nicely on whatever system you're going to view the archive on.

--convert-links: After the download is complete, convert the links in the document to make them suitable for local viewing. This affects not only the visible hyperlinks, but any part of the document that links to external content, such as embedded images, links to style sheets, hyperlinks to non-HTML content, etc.

--no-check-certificate: Don't check the server certificate against the available certificate authorities. Also don't require the URL host name to match the common name presented by the certificate.

--restrict-file-names: By default, Wget escapes the characters that are not valid or safe as part of file names on your operating system, as well as control characters that are typically unprintable. This option is useful for changing these defaults, for example because you are downloading to a non-native partition. So unless you are downloading to a non-native partition, you do not need to restrict file names by OS; it is automatic. Also note that the values 'unix' and 'windows' are mutually exclusive (one will override the other).

--domains: Limit spanning to the specified domains.

--no-parent: If you don't want Wget to ascend to the parent directory, use the -np or --no-parent option. This instructs Wget not to ascend to the parent directory when it hits references like ../ in href links.
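
For reference, a shorter invocation that achieves roughly the same result is below (a sketch: --mirror turns on recursion and timestamping, and --adjust-extension is the newer name for --html-extension):

$ wget \
     --mirror \
     --page-requisites \
     --adjust-extension \
     --convert-links \
     --no-parent \
     --domains archive.andrewbaker.ninja \
         http://archive.andrewbaker.ninja/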

Upload Files to S3 Bucket

Next upload the files to your S3 bucket. First move into the folder you downloaded the site into, then perform the recursive upload.

$ cd archive.andrewbaker.ninja
$ ls .
$ aws s3 cp . s3://vbusers.com/ --recursive
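
An alternative to cp --recursive is sync, which only uploads files that are new or have changed (handy if you re-run the wget later), and you can list the bucket afterwards to check everything landed:

$ aws s3 sync . s3://vbusers.com/
$ aws s3 ls s3://vbusers.com/ --recursive --human-readable --summarize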

Create a CloudFront Distribution from an S3 Bucket

Finally, go to CloudFront and create a distribution from the S3 bucket you just uploaded to. You can pretty much use the default settings. Note: you will need to wait a few minutes before you browse to the distribution's domain name.
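
If you prefer the CLI, a minimal sketch of creating the distribution is below (using the bucket from the upload step; CloudFront applies its default cache behaviour settings):

$ aws cloudfront create-distribution \
     --origin-domain-name vbusers.com.s3.amazonaws.com \
     --default-root-object index.html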

AWS: Install and configure the AWS CLI on a Macbook

You can absolutely get the following from the AWS help pages, but this is the lazy way to get everything you need for a simple single-account setup.

Run the two commands below to drop the package on your Mac.

$ curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
$ sudo installer -pkg AWSCLIV2.pkg -target /

Then check the versions you have installed:

$ which aws
$ aws --version

Next you need to set up your environment. Note: this is NOT the recommended way (as it uses long-term credentials).

The following example configures a default profile using sample values. Replace them with your own values as described in the following sections.

$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: secretaccesskey
Default region name [None]: af-south-1
Default output format [None]: json

You can also use named profiles. The following example configures a profile named userprod using sample values. Replace them with your own values as described in the following sections.

$ aws configure --profile userprod
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: secretaccesskey
Default region name [None]: af-south-1
Default output format [None]: json
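
To confirm the credentials actually work, a quick sanity check is to ask STS who you are:

$ aws sts get-caller-identity
$ aws sts get-caller-identity --profile userprod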

Get your access keys

  1. Sign in to the AWS Management Console and open the IAM console at https://console.aws.amazon.com/iam/.
  2. In the navigation pane of the IAM console, select Users and then select the User name of the user that you created previously.
  3. On the user’s page, select the Security credentials page. Then, under Access keys, select Create access key.
  4. For Create access key Step 1, choose Command Line Interface (CLI).
  5. For Create access key Step 2, enter an optional tag and select Next.
  6. For Create access key Step 3, select Download .csv file to save a .csv file with your IAM user’s access key and secret access key. You need this information for later.
  7. Select Done.

AWS: Automatically Stop and Start your EC2 Services

Below is a quick (am busy) outline on how to automatically stop and start your EC2 instances.

Step 1: Tag your resources

In order to decide which instances to stop and start, you first need to add an auto-start-stop: Yes tag to all the instances you want to be affected by the start/stop functions. Note: you can use "Resource Groups and Tag Editor" to bulk apply these tags to the resources that you want the lambda functions you are going to create to affect. See below (click the orange button called "Manage tags of Selected Resources").
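
If you prefer to tag from the CLI, a minimal sketch is below (the instance IDs are placeholders):

$ aws ec2 create-tags \
     --resources i-0123456789abcdef0 i-0fedcba9876543210 \
     --tags Key=auto-start-stop,Value=Yes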

Step 2: Create a new role for our lambda functions

First we need to create the IAM role to run the Lambda functions. Go to IAM and click the "Create Role" button. Then select "AWS Service" from the "Trusted entity options", and select Lambda from the "Use Cases" options. Then click "Next", followed by "Create Policy". To specify the permissions, simply click the JSON button on the right of the screen and enter the below policy (swapping the region and account ID for your own):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:StartInstances",
                "ec2:DescribeTags",
                "logs:*",
                "ec2:DescribeInstanceTypes",
                "ec2:StopInstances",
                "ec2:DescribeInstanceStatus"
            ],
            "Resource": "arn:aws:ec2:<region>:<accountID>:instance/*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/auto-start-stop": "Yes"
                }
            }
        }
    ]
}

Hit next and under “Review and create”, save the above policy as ec2-lambda-start-stop by clicking the “Create Policy” button. Next, search for this newly created policy and select it as per below and hit “Next”.

You will now see the “Name, review, and create” screen. Here you simply need to hit “Create Role” after you enter the role name as ec2-lambda-start-stop-role.

Note the policy is restricted to only have access to EC2 instances that contain the auto-start-stop: Yes tag (least privilege).

If you want to review your role, this is how it should look (you can see I have filled in my region and account number in the policy).
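
If you would rather script this step, a rough CLI equivalent is below (assuming you saved the policy JSON above as ec2-lambda-start-stop.json; swap in your own account ID):

$ aws iam create-policy --policy-name ec2-lambda-start-stop \
     --policy-document file://ec2-lambda-start-stop.json
$ aws iam create-role --role-name ec2-lambda-start-stop-role \
     --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"lambda.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
$ aws iam attach-role-policy --role-name ec2-lambda-start-stop-role \
     --policy-arn arn:aws:iam::<accountID>:policy/ec2-lambda-start-stop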

Step 3: Create Lambda Functions To Start/Stop EC2 Instances

In this section we will create two lambda functions, one to start the instances and the other to stop the instances.

Step 3a: Add the Stop EC2 instance function

  • Go to the Lambda console and click on "Create function"
  • Create a lambda function with a function name of stop-ec2-instance-lambda, the python3.11 runtime, and the ec2-lambda-start-stop-role created above (see image below).

Next add the lambda stop function below and save it as stop-ec2-instance. Note, you will need to change the value of the region_name parameter accordingly.

import json
import boto3

# Connect to EC2 in the relevant region
ec2 = boto3.resource('ec2', region_name='af-south-1')

def lambda_handler(event, context):
    # Find running instances that carry the auto-start-stop: Yes tag
    instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}, {'Name': 'tag:auto-start-stop', 'Values': ['Yes']}])
    for instance in instances:
        # Stop each matching instance
        instance.stop()
        print("Stopped instance: " + instance.id)
    return "success"
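
Once deployed, you can test the function from the CLI (a sketch; the function name matches the one created above):

$ aws lambda invoke --function-name stop-ec2-instance-lambda response.json
$ cat response.json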

This is how your Lambda function should look:

Step 3b: Add the Start EC2 instance function

  • Go to the Lambda console and click on "Create function"
  • Create a lambda function with a function name of start-ec2-instance-lambda, the python3.11 runtime, and the ec2-lambda-start-stop-role created above.
  • Then add the below code and save the function.

Note, you will need to change the value of the region_name parameter accordingly.

import json
import boto3

# Connect to EC2 in the relevant region
ec2 = boto3.resource('ec2', region_name='af-south-1')

def lambda_handler(event, context):
    # Find stopped instances that carry the auto-start-stop: Yes tag
    instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name', 'Values': ['stopped']}, {'Name': 'tag:auto-start-stop', 'Values': ['Yes']}])
    for instance in instances:
        # Start each matching instance
        instance.start()
        print("Started instance: " + instance.id)
    return "success"

Step 4: Summary

If either of the above lambda functions is triggered, it will start or stop your EC2 instances based on the instance state and the value of the auto-start-stop tag. To automate this you can simply set up cron jobs, Step Functions, Amazon EventBridge, Jenkins, etc.
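
As an example, a rough sketch of scheduling the stop function with EventBridge from the CLI is below (the rule name and cron expression are placeholders, and you need to swap in your region and account ID; repeat the same pattern for the start function):

# Stop tagged instances at 18:00 UTC on weekdays
$ aws events put-rule --name stop-ec2-nightly --schedule-expression "cron(0 18 ? * MON-FRI *)"
$ aws events put-targets --rule stop-ec2-nightly \
     --targets "Id"="1","Arn"="arn:aws:lambda:<region>:<accountID>:function:stop-ec2-instance-lambda"
# Allow EventBridge to invoke the function
$ aws lambda add-permission --function-name stop-ec2-instance-lambda \
     --statement-id eventbridge-stop --action lambda:InvokeFunction \
     --principal events.amazonaws.com \
     --source-arn arn:aws:events:<region>:<accountID>:rule/stop-ec2-nightly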

Linux: Automatically renew your certs for a wordpress site using letsencrypt

If you want to automatically renew your certs, then the easiest way is to set up a cron job to call Let's Encrypt periodically. Below is an example:

First create the bash script to renew the certificate

$ pwd
/home/bitnami
$ sudo nano renew-certificate.sh

Now enter the script in the following format into nano:

#!/bin/bash

sudo /opt/bitnami/ctlscript.sh stop apache
sudo /opt/bitnami/letsencrypt/lego --path /opt/bitnami/letsencrypt --email="myemail@myemail.com" --http --http-timeout 30 --http.webroot /opt/bitnami/apps/letsencrypt --domains=andrewbaker.ninja renew --days 90
sudo /opt/bitnami/ctlscript.sh start apache
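
Then make the script executable, otherwise cron will not be able to run it:

$ sudo chmod +x /home/bitnami/renew-certificate.sh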

Now edit the crontab to run the renew script:

$ crontab -e
0 0 * * * sudo /home/bitnami/renew-certificate.sh 2> /dev/null

Macbook/Linux: Secure Copy from your local machine to an EC2 instance

I always forget the syntax of SCP, so this is a short article with a simple example of how to SCP a file from your laptop to your EC2 instance, and how to copy it back from EC2 to your laptop:

Copying from Laptop to EC2

scp -i "mylocalpemfile.pem" mylocalfile.zip ec2-user@myEc2DnsOrIpAdress:/home/mydestinationfolder

scp -i identity_file.pem source_file.extension username@public_ipv4_dns:/remote_path

scp: Secure copy protocol
-i: Identity file
source_file.extension: The file that you want to copy
username: Username of the remote system (ubuntu for Ubuntu, ec2-user for Amazon Linux AMI, or bitnami for Bitnami WordPress)
public_ipv4_dns: DNS/IPv4 address of an instance
remote_path: Destination path

Copying from EC2 to your Laptop

scp -i "mylocalpemfile.pem" ec2-user@myEc2DnsOrIpAdress:/home/myEc2Folder/myfile.zip /Users/accountName/Downloads

scp -i identity_file.pem username@public_ipv4_dns:/remote_path/source_file.extension ~/destination_local_path
Ex: scp -i access.pem bitnami@0.0.0.0:/home/bitnami/temp.txt ~/Documents/destination_dir
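
To copy an entire directory rather than a single file, add the -r flag (paths below are placeholders):

scp -r -i "mylocalpemfile.pem" myLocalFolder ec2-user@myEc2DnsOrIpAdress:/home/mydestinationfolder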

How to back up your MySQL database on a Bitnami WordPress site

I recently managed to explode my WordPress site (whilst trying to upgrade PHP). Luckily, I had created an AMI a month ago, but I had written a few articles since then and so wanted to avoid rewriting them. Below is a method to back up your WordPress MySQL database to S3 and recover it onto a new WordPress server. Note: I actually mounted the corrupt instance as a volume and did this the long way around.

Step 1: Create an S3 bucket to store the backup

$ aws s3api create-bucket \
>     --bucket andrewbakerninjabackupdb \
>     --region af-south-1 \
>     --create-bucket-configuration LocationConstraint=af-south-1
Unable to locate credentials. You can configure credentials by running "aws configure".
$ aws configure
AWS Access Key ID [None]: XXXXX
AWS Secret Access Key [None]: XXXX
Default region name [None]: af-south-1
Default output format [None]: 
$ aws s3api create-bucket     --bucket andrewbakerninjabackupdb     --region af-south-1     --create-bucket-configuration LocationConstraint=af-south-1
{
    "Location": "http://andrewbakerninjabackupdb.s3.amazonaws.com/"
}
$ 

Note: To get your API credentials, simply go to IAM, select the Users tab and then select Create Access Key.

Step 2: Create a backup of your MySQL database and copy it to S3

For full backups, follow the below script (note: this won't be restorable across MySQL versions, as it will include the system "mysql" db):

# Check mysql is installed and check the version (note you cannot restore across versions)
mysql --version
# First get your mysql credentials
sudo cat /home/bitnami/bitnami_credentials
Welcome to the Bitnami WordPress Stack

******************************************************************************
The default username and password is XXXXXXX.
******************************************************************************

You can also use this password to access the databases and any other component the stack includes.

# Now create a backup using this password
$ mysqldump -A -u root -p > backupajb.sql
Enter password: 
$ ls -ltr
total 3560
lrwxrwxrwx 1 bitnami bitnami      17 Jun 15  2020 apps -> /opt/bitnami/apps
lrwxrwxrwx 1 bitnami bitnami      27 Jun 15  2020 htdocs -> /opt/bitnami/apache2/htdocs
lrwxrwxrwx 1 bitnami bitnami      12 Jun 15  2020 stack -> /opt/bitnami
-rw------- 1 bitnami bitnami      13 Nov 18  2020 bitnami_application_password
-r-------- 1 bitnami bitnami     424 Aug 25 14:08 bitnami_credentials
-rw-r--r-- 1 bitnami bitnami 3635504 Aug 26 07:24 backupajb.sql

# Next copy the file to your S3 bucket
$ aws s3 cp backupajb.sql s3://andrewbakerninjabackupdb
upload: ./backupajb.sql to s3://andrewbakerninjabackupdb/backupajb.sql
# Check the file is there
$ aws s3 ls s3://andrewbakerninjabackupdb
2022-08-26 07:27:09    3635504 backupajb.sql

OR, for partial backups, follow the below to just back up the bitnami_wordpress database:

# Login to database
mysql -u root -p
show databases;
+--------------------+
| Database           |
+--------------------+
| bitnami_wordpress  |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
exit
$ mysqldump -u root -p --databases bitnami_wordpress > backupajblight.sql
Enter password: 
$ ls -ltr
total 3560
lrwxrwxrwx 1 bitnami bitnami      17 Jun 15  2020 apps -> /opt/bitnami/apps
lrwxrwxrwx 1 bitnami bitnami      27 Jun 15  2020 htdocs -> /opt/bitnami/apache2/htdocs
lrwxrwxrwx 1 bitnami bitnami      12 Jun 15  2020 stack -> /opt/bitnami
-rw------- 1 bitnami bitnami      13 Nov 18  2020 bitnami_application_password
-r-------- 1 bitnami bitnami     424 Aug 25 14:08 bitnami_credentials
-rw-r--r-- 1 bitnami bitnami 2635204 Aug 26 07:24 backupajblight.sql
# Next copy the file to your S3 bucket
$ aws s3 cp backupajblight.sql s3://andrewbakerninjabackupdb
upload: ./backupajblight.sql to s3://andrewbakerninjabackupdb/backupajblight.sql
# Check the file is there
$ aws s3 ls s3://andrewbakerninjabackupdb
2022-08-26 07:27:09    2635204 backupajblight.sql

Step 3: Restore the file on your new wordpress server

Note: If you need the password, use the cat command from Step 2.

# Copy the file down from S3
$ aws s3 cp s3://andrewbakerninjabackupdb/backupajbcron.sql restoreajb.sql --region af-south-1
# Restore the db
$ mysql -u root -p < restoreajb.sql
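
To confirm the restore worked, list the databases and check that bitnami_wordpress is back:

$ mysql -u root -p -e "SHOW DATABASES;"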

Step 4: Optional – Automate the Backups using Cron and S3 Versioning

This part is not strictly necessary (and one could credibly argue that AWS Backup is the way to go, but I am not a fan of its clunky UI). Below I enable S3 versioning and create a cron job to back up the database every week. I will also set the S3 lifecycle policy to delete anything older than 90 days.

# Enable bucket versioning
aws s3api put-bucket-versioning --bucket andrewbakerninjabackupdb --versioning-configuration Status=Enabled
# Now set the bucket lifecycle policy
nano lifecycle.json 

Now paste the following policy into nano and save it (as lifecycle.json):

{
    "Rules": [
        {
            "Prefix": "",
            "Status": "Enabled",
            "Expiration": {
                "Days": 90
            },
            "ID": "NinetyDays"
        }
    ]
}

Next add the lifecycle policy to delete anything older than 90 days (as per above policy):

aws s3api put-bucket-lifecycle --bucket andrewbakerninjabackupdb --lifecycle-configuration file://lifecycle.json
## View the policy
aws s3api get-bucket-lifecycle-configuration --bucket andrewbakerninjabackupdb

Now add a CronJob to run every week:

## List the cron jobs
crontab -l
## Edit the cron jobs
crontab -e
## Enter these lines. 
## Back up on Saturday at 00:01 and 02:01, then copy the dumps to S3 at 03:00 and 04:00 (cron format: min hour day month weekday; Sunday is day zero)
1 0 * * SAT /opt/bitnami/mysql/bin/mysqldump -A -uroot -pPASSWORD > backupajbcron.sql
1 2 * * SAT /opt/bitnami/mysql/bin/mysqldump -u root -pPASSWORD --databases bitnami_wordpress > backupajbcronlight.sql
0 3 * * SAT aws s3 cp backupajbcron.sql s3://andrewbakerninjabackupdb
0 4 * * SAT aws s3 cp backupajbcronlight.sql s3://andrewbakerninjabackupdb

How to trigger Scaling Events using the Stress-ng Command

If you are testing how your autoscaling policies respond to CPU load, a really simple way to test this is using the "stress-ng" command. Note: this is a very crude mechanism and, wherever possible, you should try to generate synthetic application load instead.

#!/bin/bash

# DESCRIPTION: After updating from the repo, installs stress-ng, a tool used to create various system load for testing purposes.
# On Ubuntu/Debian:
sudo apt update -y
sudo apt install -y stress-ng
# On Amazon Linux, use yum instead (stress-ng may require the EPEL repository):
# sudo yum update -y && sudo yum install -y stress-ng

# CPU spike: Run a CPU spike for 5 seconds
uptime
stress-ng --cpu 4 --timeout 5s --metrics-brief
uptime

# Disk Test: Start N (2) workers continually writing, reading and removing temporary files:
stress-ng --disk 2 --timeout 5s --metrics-brief

# Memory stress test
# Populate memory. Use mmap N bytes per vm worker, the default is 256MB. 
# You can also specify the size as % of total available memory or in units of 
# Bytes, KBytes, MBytes and GBytes using the suffix b, k, m or g:
# Note: The --vm 2 will start N workers (2 workers) continuously calling 
# mmap/munmap and writing to the allocated memory. Note that this can cause 
# systems to trip the kernel OOM killer on Linux systems if not enough 
# physical memory and swap is not available
stress-ng --vm 2 --vm-bytes 1G --timeout 5s

# Combination Stress
# To run for 5 seconds with 4 cpu stressors, 2 io stressors and 1 vm 
# stressor using 1GB of virtual memory, enter:
stress-ng --cpu 4 --io 2 --vm 1 --vm-bytes 1G --timeout 5s --metrics-brief
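
While the stressors are running, you can watch the scaling activity from another shell (a sketch; replace the Auto Scaling group name with your own):

# Watch CPU load locally
uptime
# Watch the scaling events fired by your autoscaling policy
aws autoscaling describe-scaling-activities --auto-scaling-group-name my-asg --region af-south-1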

AWS: Making use of S3's ETags to check if a file has been altered

I was playing with S3 the other day and I noticed that a file which I had uploaded twice, in two different locations, had an identical ETag. This immediately made me think that this tag was some kind of hash. So I had a quick look at the AWS documentation, and this ETag turns out to be marginally useful. ETag is an "Entity Tag" and it is basically an MD5 hash of the file (although for multipart uploads, which are required once the file is bigger than 5GB, the ETag is no longer a plain MD5 of the whole object).

So if you ever want to compare a local copy of a file with the AWS S3 copy, you just need an MD5 tool (the below steps are for Ubuntu Linux):

# Update your Ubuntu
# Download the latest package lists
sudo apt update
# Perform the upgrade
sudo apt-get upgrade -y
# Now install common utils (inc MD5)
sudo apt install -y ucommon-utils
# Upgrades involving the Linux kernel, changing dependencies, adding / removing new packages etc
sudo apt-get dist-upgrade

Next, to view the MD5 hash of a file, simply type:

# View the MD5 hash of the file
md5sum myfilename.myextension
2aa318899bdf388488656c46127bd814  myfilename.myextension
# The first value above will match your S3 ETag if the file has not been altered
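
You can also pull the ETag straight from S3 to compare against the local hash (a sketch; the bucket and key are placeholders):

# Fetch the ETag of the S3 copy of the file
aws s3api head-object --bucket mybucket --key myfilename.myextension --query ETag --output text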

Below is the screenshot of the properties that you will see in S3 with a matching MD5 hash:

Using TPC-H tools to Create Test Data for AWS Redshift and AWS EMR

If you need to test out your big data tools, below is a useful set of scripts that I have used in the past for AWS EMR and Redshift:

# Install git and make
sudo yum install make git -y
# Install the tpch-kit
git clone https://github.com/gregrahn/tpch-kit
cd tpch-kit/dbgen
sudo yum install gcc -y
# Compile the tpch kit
make OS=LINUX
# Go home
cd ~
# Now make your emr data
mkdir emrdata
# Tell tpch to use this dir
export DSS_PATH=$HOME/emrdata
cd tpch-kit/dbgen
# Now run dbgen in verbose mode, orders table set only (-T o), 10gb data size
./dbgen -v -T o -s 10
# Move the data to an s3 bucket
cd $HOME/emrdata
aws s3api create-bucket --bucket andrewbakerbigdata --region af-south-1 --create-bucket-configuration LocationConstraint=af-south-1
aws s3 cp $HOME/emrdata s3://andrewbakerbigdata/emrdata --recursive
cd $HOME
mkdir redshiftdata
# Tell tpch to use this dir
export DSS_PATH=$HOME/redshiftdata
# Now make your redshift data
cd tpch-kit/dbgen
# Now run dbgen in verbose mode, orders table set only (-T o), 40gb data size
./dbgen -v -T o -s 40
# These are big files, so lets find out how big they are and split them
# Count lines
cd $HOME/redshiftdata
wc -l orders.tbl
# Now split orders into 15m lines per file
split -d -l 15000000 -a 4 orders.tbl orders.tbl.
# Now split line items
wc -l lineitem.tbl
split -d -l 60000000 -a 4 lineitem.tbl lineitem.tbl.
# Now clean up the master files
rm orders.tbl
rm lineitem.tbl
# Move the split data to an s3 bucket
aws s3 cp $HOME/redshiftdata s3://andrewbakerbigdata/redshiftdata --recursive
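
Once the files are in S3, a rough sketch of loading the orders data into Redshift is below (the cluster endpoint, database, user and IAM role ARN are placeholders, and the orders table must already exist with the TPC-H schema; note that dbgen terminates each row with a trailing '|', which you may need to strip or allow for in your table definition before the COPY loads cleanly):

# Run a Redshift COPY over all the orders.tbl.* split files via psql
psql -h mycluster.abc123.af-south-1.redshift.amazonaws.com -p 5439 -U awsuser -d dev \
  -c "COPY orders FROM 's3://andrewbakerbigdata/redshiftdata/orders.tbl.' IAM_ROLE 'arn:aws:iam::<accountID>:role/myRedshiftRole' DELIMITER '|';"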