Windows: Delete the Icon Cache

I remember getting weird flashing on my laptop and eventually figured out my icon cache was corrupted. So if you ever get this, try running the script below. This is obviously quite a weird/random post – hope it’s helpful 🙂

cd /d %userprofile%\AppData\Local\Microsoft\Windows\Explorer
REM Stop Explorer so the cache files are not locked
taskkill /f /im explorer.exe
REM Remove the hidden attribute, then delete the cache files
attrib -h iconcache_*.db
del iconcache_*.db
REM Restart Explorer so it rebuilds the cache
start explorer
pause
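
A quick usage note: save the above as something like fix-icons.bat (the name is just a suggestion) and run it from a command prompt. The script kills Explorer before deleting the cache files (they are locked while Explorer is running), so your taskbar and desktop will vanish for a second or two until it restarts.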

The Least Privileged Lie

In technology there is a tendency to solve a problem badly using gross simplification, come up with a catchy one liner and then broadcast this as doctrine or a principle. Nothing ticks more boxes in this regard than the principle of least privilege. The ensuing enterprise scale deadlocks created by a crippling implementation of least privilege are almost certainly lost on its evangelists. This blog will try to put an end to the slavish efforts of many security teams that are trying to ration out micro permissions and hope the digital revolution can fit into some break glass approval process.

What is this “Least Privileged” thing? Why does it exist? What are the alternatives? Wikipedia gives you a good overview of this here. The first line contains an obvious and glaring issue: “The principle means giving a user account or process only those privileges which are essential to perform its intended function”. Here the principle is being applied equally to users and processes/code. The principle also states that we should only grant privileges that are essential. What this principle is trying to say is that we should treat human beings and code as the same thing, and that we should only give humans “essential” permissions. Firstly, who on earth figures out what that bar for essential is, and how do they ascertain what is and what is not essential? Do you really need to use storage? Do you really need an API? If I give you an API, do you need Puts and Gets?

Human beings are NOT deterministic. If I have a team of humans that can operate under the principle of least privilege then I don’t need them in the first place. I can simply replace them with some AI/RPA. Imagine the brutal pain of a break glass activity every time someone needed to do something “unexpected”. “Hi boss, I need to use the bathroom on the 1st floor – can you approve this? <Gulp> Boss you took too long… I no longer need your approval!”. Applying least privilege to code would seem to make some sense; BUT only if you never updated the code, and if you did update the code you would need to make sure you have 100% test coverage.

So why did some bright spark want to duct tape the world to such a brittle, pain yielding principle? At the heart of this are three issues: Identity, Immutability and Trust. If there are other ways to solve these issues then we don’t need the pain and risks of trying to implement something that will never actually work, creates friction and, critically, creates a false sense of security. Least privilege will never save anyone; you will just be told that if you could have performed this security miracle then you would have been fine. But you cannot, and so you are not.

What’s interesting to me is that the least privileged lie is so widely ignored. For example, just think about how we implement user access. If we truly believed in least privilege then every user would have a unique set of privileges assigned to them. Instead, because we acknowledge this is burdensome, we approximate the privileges that a user will need using policies which we attach to groups. The moment we add a user to one of these groups, we are approximating their required privileges and start to become overly permissive.

Let’s be clear with each other: anyone trying to implement least privilege is living a lie. The extent of the lie normally only becomes clear after the event. So this blog post is designed to re-point energy towards sustainable alternatives that work, and additionally remove the need for the myriad of micro permissive handbrakes (that routinely get switched off to debug outages and issues).

Who are you?

This is the biggest issue and still remains the largest risk in technology today. If I don’t know who you are then I really, really want to limit what you can do. Experiencing a root/super user account takeover is a doomsday scenario for any organisation. So let’s limit the blast zone of these accounts, right?

This applies equally to code and humans. For code this problem was solved a long time ago, and if you look

Is this really my code?

AWS: Making use of S3’s ETags to check if a file has been altered

I was playing with S3 the other day and I noticed that a file which I had uploaded twice, in two different locations, had an identical ETag. This immediately made me think that this tag was some kind of hash. So I had a quick look at the AWS documentation and this ETag turns out to be marginally useful. ETag is an “Entity Tag” and it’s basically an MD5 hash of the file (although for objects uploaded via multipart upload – typically larger files – the ETag is no longer a plain MD5 of the file).

So if you ever want to compare a local copy of a file with an AWS S3 copy of a file, you just need the md5sum utility (the below steps are for Ubuntu Linux):

# Update your Ubuntu
# Download the latest package lists
sudo apt update
# Perform the upgrade
sudo apt-get upgrade -y
# Now install common utils (inc MD5)
sudo apt install -y ucommon-utils
# Upgrades involving the Linux kernel, changing dependencies, adding / removing new packages etc
sudo apt-get dist-upgrade

Next, to view the MD5 hash of a file simply type:

# View the MD5 hash of the file
md5sum myfilename.myextension
2aa318899bdf388488656c46127bd814  myfilename.myextension
# The first value above will match your S3 ETag if the file has not been altered

Below is the screenshot of the properties that you will see in S3 with a matching MD5 hash:
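
If you want to script the comparison rather than eyeballing it, something like the below works (a minimal sketch: the bucket, key and file names are placeholders, and it assumes the object was uploaded in a single part so the ETag is a plain MD5):

# Placeholder names - swap in your own file, bucket and key
LOCAL_FILE=myfilename.myextension
BUCKET=mybucket
KEY=myfilename.myextension

# md5sum prints "<hash>  <filename>", so keep just the first field
LOCAL_MD5=$(md5sum "$LOCAL_FILE" | awk '{print $1}')

# head-object returns the ETag wrapped in quotes, so strip them off
S3_ETAG=$(aws s3api head-object --bucket "$BUCKET" --key "$KEY" --query ETag --output text | tr -d '"')

if [ "$LOCAL_MD5" = "$S3_ETAG" ]; then
  echo "Match - the S3 copy has not been altered"
else
  echo "No match - the file differs (or was a multipart upload)"
fi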

Using TPC-H tools to Create Test Data for AWS Redshift and AWS EMR

If you need to test out your big data tools, below is a useful set of scripts that I have used in the past for AWS EMR and Redshift:

# Install git and make
sudo yum install make git -y
# Install the tpch-kit
git clone https://github.com/gregrahn/tpch-kit
cd tpch-kit/dbgen
sudo yum install gcc -y
# Compile the tpch kit
make OS=LINUX
# Go home
cd ~
# Now make your emr data
mkdir emrdata
# Tell tpch to use this dir
export DSS_PATH=$HOME/emrdata
cd tpch-kit/dbgen
# Now run dbgen in verbose mode, with the orders table, 10gb data size
./dbgen -v -T o -s 10
# Move the data to an s3 bucket
cd $HOME/emrdata
aws s3api create-bucket --bucket andrewbakerbigdata --region af-south-1 --create-bucket-configuration LocationConstraint=af-south-1
aws s3 cp $HOME/emrdata s3://andrewbakerbigdata/emrdata --recursive
cd $HOME
mkdir redshiftdata
# Tell tpch to use this dir
export DSS_PATH=$HOME/redshiftdata
# Now make your redshift data
cd tpch-kit/dbgen
# Now run dbgen in verbose mode, with the orders table, 40gb data size
./dbgen -v -T o -s 40
# These are big files, so let's find out how big they are and split them
# Count lines
cd $HOME/redshiftdata
wc -l orders.tbl
# Now split orders into 15m lines per file
split -d -l 15000000 -a 4 orders.tbl orders.tbl.
# Now split line items
wc -l lineitem.tbl
split -d -l 60000000 -a 4 lineitem.tbl lineitem.tbl.
# Now clean up the master files
rm orders.tbl
rm lineitem.tbl
# Move the split data to an s3 bucket
aws s3 cp $HOME/redshiftdata s3://andrewbakerbigdata/redshiftdata --recursive
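
As a quick sanity check after the copy, you can list what actually landed in the bucket and get a total object count and size (this just reuses the example bucket name from the script above):

# List the uploaded split files with human readable sizes and a summary
aws s3 ls s3://andrewbakerbigdata/redshiftdata/ --recursive --human-readable --summarize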

Setting up ssh for ec2-user to your wordpress sites

So after getting frustrated (and even recreating my ec2 instances) due to a “Permission denied (publickey)” error, I finally realised that the wordpress builds are by default set up for SSH using the bitnami account (or at least my build was).

This means each time I login using ec2-user I get:

sudo ssh -i CPT_Default_Key.pem ec2-user@ec2-13-244-140-33.af-south-1.compute.amazonaws.com
ec2-user@ec2-13-244-140-33.af-south-1.compute.amazonaws.com: Permission denied (publickey).

Being a limited human being, I will never cope with two user names. Moving over to a standard login name (ec2-user) is relatively simple – just follow the below steps (after logging in using the bitnami account):

sudo useradd -s /bin/bash -o -u $(id -u) -g $(id -g) ec2-user

sudo mkdir ~ec2-user/
sudo cp -rp ~bitnami/.ssh ~ec2-user/
sudo cp -rp ~bitnami/.bashrc ~ec2-user/
sudo cp -rp ~bitnami/.profile ~ec2-user/

Next you need to copy your public key into the authorised keys file using:

cat mypublickey.pub >> /home/ec2-user/.ssh/authorized_keys

Next, to allow the ec2-user to execute commands as the root user, add the new user account to the bitnami-admins group by executing the following command when logged in as the bitnami user:

sudo usermod -aG bitnami-admins ec2-user
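
To check everything works, log out and SSH back in as the new user (the key and hostname below are just the ones from earlier in this post), then confirm you can escalate to root:

ssh -i CPT_Default_Key.pem ec2-user@ec2-13-244-140-33.af-south-1.compute.amazonaws.com
# once logged in, confirm sudo works via the bitnami-admins group
sudo whoami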

Linux: Quick guide to the CD command – for windows dudes :)

Ok, so I am a windows dude and only after Docker and K8s came along did I start to get all the hype around Linux. To be fair, Linux is special and I have been blown away by the engineering effort behind this OS (and also glad to leave my Daniel Appleman Win32 API book on the shelf for a few years!).

What surprises me with Linux is the number of shortcuts, so before I forget them I am going to document a few of my favorites (the context here is that I use WSL2 a lot and these are my favorite navigation commands).

Exchanging files between Linux and Windows:

This is a bit of a pain, so I just create a symbolic link to the windows root directory in my linux home directory so that I can easily copy files back and forth.

cd ~
ln -s /mnt/c/ mywindowsroot
cd mywindowsroot
ls
# copy everything from my windows root folder into my wsl home directory
cd ~
cp -r mywindowsroot/. .

Show Previous Directory

echo "$OLDPWD"

Switch back to your previous directory

cd -

Move to Home Directory

cd ~
or just use
cd

Pushing and Popping Directories

pushd and popd are shell builtins in bash (and certain other shells): pushd saves the current working directory onto a stack, and popd pops the most recent directory off the stack and changes back to it. This is very handy when you’re jumping around but don’t want to create symbolic links.

# Push the current directory onto the stack (you can also enter an absolute directory here, like pushd /var/www)
pushd .
# Go to the home dir
cd
ls
# Now move back to this directory
popd
ls
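
If you push more than one directory, the dirs builtin will show you the whole stack with an index per entry, and you can jump straight to any entry; a quick example:

# Push a couple of directories onto the stack
pushd /var/log
pushd /etc
# Show the stack with indexes (0 is where you are now)
dirs -v
# Rotate the stack so entry 1 becomes the current directory
pushd +1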

AWS: Please Fix Poor Error Messages, API standards and Bad Defaulting


This is a short blog, and it’s actually just a simple plea to AWS. Please can you do three things?

  1. North Virginia appears to be the AWS master node. Having this region as a master region causes a large number of support issues (for example S3, KMS, CloudFront and ACM all use this pet region, and all of their APIs suffer as a result). This, coupled with point 2), creates some material angst.
  2. Work a little harder on your error messages – they are often really (really) bad. I will post some examples at the bottom of this post over time. But you have to do some basics, like rejecting unknown parameters (yes, it’s useful to know there is a typo rather than just ignoring the parameter).
  3. Use standard parameters across your APIs (eg make specifying the region consistent (even within single products it’s not consistently applied) and make your verbs consistent).

As a simple example, below I am logged into an EC2 instance in af-south-1 and I can create an S3 bucket in North Virginia, but not in af-south-1. I am sure there is a “fix” (change some config, find out an API parameter was invalid and was silently ignored etc) – but this isn’t the point. The risk (and it’s real) is that in an attempt to debug this, developers will tend to open up security groups, open up NACLs, widen IAM roles etc. When the devs finally fix the issue, they will be very unlikely to retrace all their steps and restore everything else that they changed. This means that you end up with debugging scars that create overly permissive services, due to poor error messages, inconsistent API parameters/behaviors and a regional bias. Note: I am aware of commercial products, like Radware’s CWP – but that’s not the point. I shouldn’t ever need to debug by dialling back security. Observability was supposed to be there from day 1. The combination of tangential error messages, inconsistent APIs and a lack of decent debug information from core services like IAM and S3 is creating a problem that shouldn’t exist.

AWS is a global cloud provider – services should work identically across all regions, APIs should have standards, APIs shouldn’t silently ignore mistyped parameters, and the base config required should either come from context (ie I am running in region x) or config (aws configure) – not from a global default region.

Please note: I deleted the bucket between running the two commands, and aws configure seemed to be ignored by create-bucket.

[ec2-user@ip-172-31-24-139 emrdata]$ aws s3api create-bucket --bucket ajbbigdatabucketlab2021
{
    "Location": "/ajbbigdatabucketlab2021"
}
[ec2-user@ip-172-31-24-139 emrdata]$ aws s3api create-bucket --bucket ajbbigdatabucketlab2021 --region af-south-1

An error occurred (IllegalLocationConstraintException) when calling the CreateBucket operation: 
The unspecified location constraint is incompatible for the region specific endpoint this request was sent to.

Note, I worked around the create-bucket behaviour by replacing it with mb:

[ec2-user@ip-172-31-24-139 emrdata]$ aws s3 mb s3://ajbbigdatabucketlab2021 --region af-south-1
make_bucket: ajbbigdatabucketlab2021

Thanks to the AWS dudes for letting me know how to get this working. It turns out the create-bucket and mb APIs don’t use standard parameters. See below (the region tag needs to be replaced by a verbose bucket config tag):

aws s3api create-bucket --bucket ajbbigdatabucketlab2021 --create-bucket-configuration LocationConstraint=af-south-1
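
On a related note, you can at least set a default region so the CLI stops assuming us-east-1 for most commands (this is just the standard aws configure behaviour; as above, create-bucket still wants the LocationConstraint outside us-east-1):

# Set the default region for the current profile
aws configure set default.region af-south-1
# Confirm what the CLI will now use
aws configure get region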

A simple DDOS SYN flood Test

Getting an application knocked out with a simple SYN flood is both embarrassing and avoidable. It’s also very easy to create a SYN flood, so it’s something you should design against. Below is the hping3 command line that I use to test my services against SYN floods. I have used quite a few mods to make the test a bit more realistic – but you can also distribute this across a few machines to stretch the target host a bit more if you want to.

Parameters:

-c --count: Stop after sending (and receiving) count response packets. After the last packet is sent, hping3 waits COUNTREACHED_TIMEOUT seconds for the target host’s replies. You can tune COUNTREACHED_TIMEOUT by editing hping3.h.

-d --data: Set packet body size. Warning: using --data 40, hping3 will not generate 0 byte packets but protocol_header+40 bytes. hping3 will display packet size information as the first line of output, like this: HPING www.yahoo.com (ppp0 204.71.200.67): NO FLAGS are set, 40 headers + 40 data bytes

-S --syn: Set the SYN tcp flag.

-w --win: Set the TCP window size. The default is 64.

-p --destport [+][+]dest port: Set the destination port; the default is 0. If a ‘+’ character precedes the dest port number (i.e. +1024), the destination port will be increased for each reply received. If a double ‘+’ precedes the dest port number (i.e. ++1024), the destination port will be increased for each packet sent. By default the destination port can be modified interactively using CTRL+z.

--flood: Send packets as fast as possible, without waiting for incoming replies. This is faster than the -i u0 option.

--rand-source: This option enables random source mode. hping will send packets with a random source address. It is interesting to use this option to stress firewall state tables, and other per-ip dynamic tables inside TCP/IP stacks and firewall software.

# Install hping3 (Debian/Ubuntu)
apt-get update
apt install hping3
# 15000 packets, 120 byte body, SYN flag, window 64, port 443, flood mode, random source addresses
hping3 -c 15000 -d 120 -S -w 64 -p 443 --flood --rand-source <my-ip-to-test>
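
To see the effect on the target while the flood is running, a quick check is to count half-open connections on the target host (this assumes a reasonably modern Linux with the ss utility installed):

# Count connections sitting in SYN-RECEIVED
ss -n state syn-recv | wc -l
# Or watch the count update every second
watch -n 1 'ss -n state syn-recv | wc -l'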

The Triplication Paradigm

Wasting money can often happen when you think you’re being clever…

Introduction

In most large corporates, technology will typically report into either finance or operations. This means that it will tend to be subject to cultural inheritance, which is not always a good thing. One example of where the cultural default should be challenged is when managing IP duplication. In finance or operations, duplication rarely yields any benefits and will often result in unnecessary costs and/or inconsistent customer experiences. Because of this, technology teams will tend to be asked to centrally analyse all incoming workstreams for convergence opportunities. If any seemingly overlapping effort is discovered, this would then typically be extracted into a central, “do it once” team. Experienced technologists will likely remark that it generally turns out that the analysis process is very slow, overlaps are small, the cost of extracting them is high, additional complexity is introduced, backlogs become unmanageable, testing the consolidated “swiss army knife” product is problematic and, critically, the teams are typically reduced to crawling speed as they try to transport context and requirements to the central delivery team. I have called the above process “Triplication”, simply because it creates more waste and costs more than duplication ever could (and also because my finance colleagues seem to connect with this term).

The article below attempts to explain why we fear duplication and why slavishly trying to remove all duplication is a mistake. Having said this, a purely federated model or abundant resource model with no collaboration leads to similarly chronic issues (I will write an article about “Federated Strangulation” shortly).

The Three Big Corporate Fears

  1. The fear of doing something badly.
  2. The fear of doing something twice (duplication).
  3. The fear of doing nothing at all.

In my experience, most corporates focus on fears 1) and 2). They will typically focus on layers of governance, contractual bindings, interlocks and magic metric tracking (SLA, OLA, KPI etc etc). The governance is typically multi-layered, with each forum meeting infrequently and ingesting the data in a unique format (no sense in not duplicating the governance overhead, right?!). As a result these large corporates typically achieve fear 3) – they will do nothing at all.

Most start-ups/tech companies worry almost exclusively about 3) – as a result they achieve a bit of 1) and 2). Control is highly federated, decision trees are short, and teams are self empowered and self organising. Dead ends are found quickly, bad ideas are cancelled or remediated as the work progresses. Given my rather biased narrative above, it won’t be a surprise to learn that I believe 3) is the greatest of all evils. To allow yourself to be overtaken is the greatest of all evils; to watch a race that you should be running is the most extreme form of failure.

For me, managed duplication can be a positive thing. But the key is that you have to manage it properly. You will often see divergence and consolidation in equal measure as the various work streams mature. The key to managing duplication is to enforce scarcity of resources and collaboration. Additionally, you may find that a decentralised team becomes conflicted when it is asked to manage multiple business units’ interests. This is actually success! It means this team has created something that has been virally absorbed by other parts of the business – it means you have created something that’s actually good! When this happens look at your contribution options; sometimes it may make sense to split the product team up into several business facing teams and a core platform engineering team. If, however, there is no collaboration and an abundance of resources are thrown at all problems, you end up with material and avoidable waste. Additionally, observe exactly what you’re duplicating – never duplicate a commodity and never federate data. You also need to avoid a snowflake culture and make sure that where it makes sense you are trying to share.

Triplication happens when two or more products are misunderstood to be “similar” and an attempt is made to fuse them together. The over aggregation of your product development streams will yield most of the below:

1) Cripplingly slow and expensive to develop.

2) Risk concentration/instability. Every release will cause trauma to multiple customer bases.

3) Unsupportable. It will take you days to work out what went wrong and how on earth you can fix the issue as you will suffer from Quantum Entanglement.

4) Untestable. The complexity of the product will guarantee each release causes distress.

5) Low grade client experience.

Initially these problems will be described as “teething problems”. After a while it becomes clearer that the problem is not fixing itself. Next you will likely start the “stability” projects. A year or so later, after the next pile of cash is burnt, there will be a realisation that this is as good as it gets. At this point, senior managers start to see the writing on the wall and will quickly distance themselves from the product. Luckily for them, nobody will likely remember exactly who in the many approval forums thought this was a good idea in the first place. Next the product starts to get linked to the term “legacy”. The final chapter for this violation of common sense is the multi-year decommissioning process. BUT – it’s highly likely that the strategic replacement contains the exact same flaws as the legacy product…

The Conclusion

To conclude, I created the term “Triplication” as I needed a way to succinctly explain that things can get worse when you lump them together without a good understanding of why you’re doing this. I needed a way to help challenge statements like, “you have to be able to extract efficiencies if you just lump all your teams together”. This thinking is equivalent to saying; “hey I have a great idea…! We ALL like music, right?? So let’s save money – let’s go buy a single CD for all of us!”

The reality for those that have played out the triplication scenario in real life is that you will see costs balloon, progress grind to a halt and revenues fall off a cliff; the final step in the debacle is usually a loss of trust – followed by the inevitable outsourcing pill. On the other hand collaboration, scarcity, lean, quick MVPs, shared learning, cloud, open source, common rails and internal mobility are the friends of fast deliverables, customer satisfaction and yes – low costs!

Part 1: The Great Public Cloud Crusade…

“Not all cloud transformations are created equally…!”

The cloud is hot…. not just a little hot, but smokin’ hot!! Covid is messing with the economy, customers are battling financially, the macro economic outlook is problematic, vendor costs are high and climbing, and security needs more investment every year. What on earth do we do??!! I know…. let’s start a crusade – let’s go to the cloud!!!!

Cloud used to be just for the cool kids, the start ups, the hipsters… but not anymore, now corporates are coming and they are coming in their droves. The cloud transformation conversation is playing out globally for almost all sectors, from health care, to pharmaceuticals and finance. The hype and urban legends around public cloud are creating a lot of FOMO.

For finance teams under severe cost pressures, the cloud has to be an obvious place to seek out some much needed pain relief. CIOs are giving glorious on stage testimonials, declaring victory after having gone live with their first “bot in the cloud”. So what is there to blog about, it’s all wonderful right…? Maybe not…

The Backdrop…

Imagine you’re a CIO or CTO; you haven’t cut code for a while or maybe you have a finance background. Anyway, your architecture skills are a bit rusty/vacant, you have been outsourcing technology work for years, you are awash with vendor products, all the integration points are “custom” (aka arc welded) and hence your stack is very fragile. In fact it’s so fragile you can trigger outages when someone closes your datacentre door a little too hard! Your technology teams all have low/zero cloud knowledge and now you have been asked to transform your organisation by shipping it off to the cloud… So what do you do???

Lots of organisations believe this challenge is simply a case of finding the cheapest cloud provider, writing a legal document and some SLAs, and finding a vendor who can whiz your servers into the cloud – then you simply cut a cheque. But the truth is the cloud requires IP, and if you don’t have IP (aka engineers) then you have a problem…

Plan A: Project Borg

This is an easy problem – right? Just ask the AWS borg to assimilate you!!! The “Borg” strategy can be achieved by:

  1. Install some software agents in your data centers to come up with a total thumb suck on how much you think you will spend in the cloud. Note: your lack of any real understanding of how the cloud works should not ring any warning bells.
  2. Factor down this thumb suck using another made up / arbitrary “risk factor”.
  3. Next, sign an intergalactic cloud commit with your cloud provider of choice and try to squeeze more than a 10% discount out for taking this enormous risk.
  4. Finally pick up the phone to one of the big 5 consultants and get them to “assimilate” you in the cloud (using some tool to perform a bitwise copy of your servers into the cloud).

Before you know it you’re peppering your board and excos with those ghastly cloud packs, you are sending out group wide emails with pictures of clouds on them, and you are telling your teams to become “cloud ready”. What’s worse, you’re burning serious money as the consultancy team you called in did the usual land and expand. But you can’t seem to get a sense of any meaningful progress (and no, a BOT in the cloud doesn’t count as progress).

To fund this new cloud expense line you have to start strangling your existing production spending: maybe you are running your servers for an extra year or two, strangling the network spend, keeping those storage arrays for just a little while longer. But don’t worry, before you know it you will be in the cloud – right??

The Problem Statement

The problem is that public cloud was never about physically relocating your iffy datacentre software to someone else; it was supposed to be about transformation of this software. The legacy software in your datacentre is almost certainly poisonous and its interdependencies will be as lethal as they are opaque. If you move it, pain will follow and you won’t see any real commercial benefits for years.

Put another way, your datacentre is the technical equivalent of a swamp. Luckily those lovely cloud people have built you a nice clean swimming pool. BUT don’t go and pump your swamp into this new swimming pool!

Crusades have never given us rational outcomes. You forgot to imagine where the customer was in this painful sideways move – what exactly did you want from this? In fact cloud crusades suffer from a list of oversights, weaknesses and risks:

  1. Actual digital “transformation” will take years to realise (if ever). All you did was change your hosting and how you pay for technology – nothing else actually changed.
  2. Your customer value proposition will be totally unchanged, sadly you are still as digital as a fax machine!
  3. Key infrastructure teams will start realising there is no future for them and start wandering, creating even more instability.
  4. Stability will be problematic as your hybrid network has created a BGP birds nest.
  5. Your company signed a 5 year cloud commit. You took your current tech spend, halved it and then asked your cloud provider to give you discounts on this projected spend. You will likely see around a 10%-15% reduction in your EDP (enterprise discount program) rates, and for this you are taking ENORMOUS downside risks. You’re also accidentally discouraging efficient utilisation of resources, in favour of a culture of “ram it in the cloud and review it once our EDP period expires”.
  6. Your balance sheet will balloon, such that you will end up with a cost base not dissimilar to NASA’s, you will need a PhD to diagnose issues and your delivery cadence will be close to zero. Additionally, you will need to create an impairment factory to deal with all your stranded assets.

So what does this approach actually achieve? You will have added a ton of intangible assets by balance sheeting a bunch of professional fees, you will likely be less stable and even less secure (more on this later), and you know that this is an unsustainable project and that it is the equivalent of an organisational heart transplant. The only people that now understand your organisation are a team of well paid consultants on a 5x salary multiple, and sadly you cannot stop this process – you have to keep paying and praying. Put simply, cloud mass migration (aka assimilation) is a bad idea – so don’t do it!

The key here is that your tech teams have to transform themselves. Nothing can act on them; the transformation has to come from within. When you review organisations that have been around for a while – they may have had a few mergers, have high vendor dependencies and low technology skills – you will tend to find the combined/systemic complexity suffering from something similar to Quantum Entanglement. We then ask an external agency to unpack this unobservable, irreducible complexity with a few tools, and get expensive external forces to reverse engineer these entangled systems and recreate them somewhere else. This is not reasonable or rational – it’s daft and we should stop doing this.

If not this, then what?

The “then what” bit is even longer than the “not this” bit. So I am posting this as is, and if I get 100 hits I will write up the other way – little by little 🙂

Click here to read the work in progress link on another approach to scaling cloud usage…