New AutoSpotting Version

Hi there,

Just in case you haven’t been following the AutoSpotting project, a few weeks ago we released a new version that improves it a lot, and we recommend you give it a try. Below you can see a few highlights of the latest version and some general news about the project.

The spot bidding engine was heavily refactored, using less memory and scaling much better on large installations

Previously AutoSpotting launched spot instances using the traditional ec2.RequestSpotInstances API call. This API works well enough, but it has some limitations that had to be handled by repeatedly polling for the newly launched instances, causing dozens of API calls for each new instance. The Lambda function could also run out of memory and become unreliable immediately after being enabled for the first time on AWS accounts with lots of groups, due to API throttling. There were complaints from people running AutoSpotting on more than 100 groups and 500 instances in a single region, where initial runs would fail to handle the newly launched instances, leaving them lying around unattached to their groups. Some people even resorted to writing their own AutoSpotting clones to better address this situation, which was an unfortunate waste of effort.

This problematic API call was recently replaced with the ordinary ec2.RunInstances API call, the same one used for launching on-demand instances. RunInstances was enhanced in December last year to also support launching spot instances, and it addresses pretty much all the limitations of the previous ec2.RequestSpotInstances API call.

This offloads a lot of logic we previously had to implement ourselves, and allowed us to delete a significant amount of polling code. The code base has been significantly cleaned up and reduced in size by about a fifth, and in the process we also addressed some other issues described below and slightly increased the test coverage.
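For illustration, here is a rough sketch of what such a spot launch looks like through the Go SDK's ec2.RunInstances call. This is not the actual AutoSpotting code; the AMI, instance type and security group ID are placeholders:

package main

import (
    "fmt"
    "log"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
    svc := ec2.New(session.Must(session.NewSession()))

    out, err := svc.RunInstances(&ec2.RunInstancesInput{
        ImageId:      aws.String("ami-12345678"), // placeholder AMI
        InstanceType: aws.String("m4.large"),     // placeholder instance type
        MinCount:     aws.Int64(1),
        MaxCount:     aws.Int64(1),
        // Security groups can always be passed by ID, regardless of
        // VPC, Default VPC or EC2-Classic.
        SecurityGroupIds: []*string{aws.String("sg-12345678")},
        // This is what turns a plain RunInstances call into a spot request.
        InstanceMarketOptions: &ec2.InstanceMarketOptionsRequest{
            MarketType: aws.String(ec2.MarketTypeSpot),
            SpotOptions: &ec2.SpotMarketOptions{
                // Omitting MaxPrice defaults the bid to the on-demand price.
                SpotInstanceType: aws.String(ec2.SpotInstanceTypeOneTime),
            },
        },
    })
    if err != nil {
        // Unfulfillable requests are cancelled by EC2 itself, so we can
        // simply log the failure and retry in a later run.
        log.Fatal(err)
    }
    fmt.Println("launched", aws.StringValue(out.Instances[0].InstanceId))
}

Because EC2 cancels unfulfillable requests on its own, there is no longer any need to poll for the state of a separate spot request.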


This work was sponsored by HERE Technologies, who recently rolled out AutoSpotting on more than 400 AWS accounts and needed it in order to properly handle their largest accounts. It was successfully tested on one of their largest AWS accounts, where we didn't notice any problems and memory usage was much lower than before. Extrapolating from this test, we estimate that the current default configuration should safely handle more than 1000 node replacements per AWS account, and could potentially be extended to about 5000 by increasing the Lambda function's memory allocation. We don't have access to such a large setup, but if you do, please try it out and let us know what you see; we'd like to hear from you.

Better handling of out of capacity situations

As of December last year, AWS also fundamentally changed the way the spot market works, making spot prices stable over time and decoupling the bid price from the launch and termination of spot instances.

This change is largely beneficial, but unfortunately it had some unforeseen implications for AutoSpotting, because spot launches can now fail even if the bid price is much higher than the market price. This could cause AutoSpotting to launch multiple instances when the spot capacity couldn't be fulfilled within a single Lambda function run. These instances were never tagged, so in some circumstances they would not be set up and remained running outside the group.

This has been addressed out of the box by the ec2.RunInstances API call, which automatically cancels the spot instance requests it creates on our behalf if they can't be fulfilled immediately. AutoSpotting will simply fail after a timeout of a few seconds, and will retry launching another instance in subsequent runs until it succeeds.

Better handling of VPC, DefaultVPC and EC2 Classic security groups

The API for launching instances had some issues with the way we have to configure the security groups on newly launched spot instances. Groups sometimes needed to be given by ID and other times by name, so we needed some ugly workarounds in order to support VPC, Default VPC and EC2-Classic at the same time. Even with those, it failed to handle some edge cases, such as Default VPC environments created by CloudFormation code designed for EC2-Classic, as is the case with ElasticBeanstalk environments running in the Default VPC.

This has been addressed out of the box by the ec2.RunInstances API call, which always accepts security groups given by ID, so we could clean up all those workarounds. The code was tested and seems to work reliably on all these flavors of EC2.

Support running in opt-out mode

By default AutoSpotting runs in opt-in mode, only taking over the groups tagged with a certain tag (by default “spot-enabled=true”) while ignoring all the other groups. This allows people to test it properly and gradually extend the rollout to their groups as they gain confidence in it. But eventually they may be confident enough to enable it by default on all their groups, regardless of whether this tag is set. This can also be enforced by companies running lots of AWS accounts, which can enable it by default on all their development accounts, where the risk of suddenly enabling this feature is relatively small but the benefit is substantial enough to be worth it.

This can be enabled by using the latest CloudFormation or Terraform infrastructure code, where it is exposed as an additional parameter. The default tag is “spot-enabled=true”, but this is configurable.
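The resulting group filtering logic is conceptually very simple. The sketch below uses made-up helper and parameter names rather than the actual implementation, but it shows the idea behind the two modes:

// Simplified sketch of opt-in vs. opt-out group filtering; the function
// and parameter names are made up for illustration.
func shouldProcessGroup(tags map[string]string, optOutMode bool) bool {
    const tagKey = "spot-enabled" // the tag key is configurable at install time

    value, tagged := tags[tagKey]

    if optOutMode {
        // In opt-out mode every group is processed unless it is
        // explicitly disabled with spot-enabled=false.
        return !(tagged && value == "false")
    }
    // In the default opt-in mode only groups tagged with
    // spot-enabled=true are taken over; everything else is ignored.
    return tagged && value == "true"
}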

This work was also sponsored by HERE Technologies, who plans to eventually enable it on all their 200+ development-only AWS accounts, potentially generating savings of millions of dollars monthly.

Tagging improvements

AutoSpotting now tags all the launched spot instances with “launched-by-autospotting=true”, and it can also tag its own Lambda function at install time with a configurable tag.
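For illustration, tagging a freshly launched instance boils down to a single ec2.CreateTags call. This sketch reuses the svc client from the earlier example and uses a placeholder instance ID:

// Sketch of tagging a newly launched spot instance; the instance ID is a
// placeholder and error handling is kept to the bare minimum.
_, err := svc.CreateTags(&ec2.CreateTagsInput{
    Resources: []*string{aws.String("i-0123456789abcdef0")},
    Tags: []*ec2.Tag{
        {
            Key:   aws.String("launched-by-autospotting"),
            Value: aws.String("true"),
        },
    },
})
if err != nil {
    log.Printf("failed to tag instance: %v", err)
}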

This work was also sponsored by HERE Technologies, who plans to use this to measure the overall runtime cost and the savings generated by AutoSpotting across their entire fleet.

Instance type expansion and price updates

AutoSpotting now has support for all the recently launched instance types, such as the C5D and M5D, including the automated compatibility checks for their storage volumes. The pricing information is now also up-to-date so we can use it to set more accurate bid prices.

Instance scale-in protection support

AutoSpotting now considers the scale-in protection flag that can be set on AutoScaling group members. Support for instance termination protection is still a work in progress and is expected to be released within the next few weeks.
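Conceptually, protected group members are simply excluded from the set of on-demand instances considered for replacement, along the lines of this sketch (the function name is made up for illustration):

import "github.com/aws/aws-sdk-go/service/autoscaling"

// replaceableInstances filters out group members that have scale-in
// protection enabled; the surrounding selection logic is simplified away.
func replaceableInstances(instances []*autoscaling.Instance) []*autoscaling.Instance {
    var result []*autoscaling.Instance
    for _, i := range instances {
        if i.ProtectedFromScaleIn != nil && *i.ProtectedFromScaleIn {
            continue // protected instances are left alone
        }
        result = append(result, i)
    }
    return result
}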

This work was contributed by Jam ‘codejamninja’ Risser.

Fix compilation on macOS

AutoSpotting can now be built on Mac, without using Docker or VMs.

This work was also sponsored by HERE Technologies.

Smaller binaries

The binaries are now stripped of debugging information, which decreases their size by about 20%.

This work was also sponsored by HERE Technologies.

Terraform module in the Terraform Registry

Thanks to a contribution by Neill Turner, our slightly modified Terraform module is now published in the Terraform Registry, which makes installation much easier for Terraform users, as easy as:

module "autospotting" {
  source  = "cristim/autospotting/aws"
  version = "0.0.9"
}

HERE Technologies now supports development of AutoSpotting

The fact that you noticed so many contributions sponsored by HERE Technologies is not a coincidence. I have been their full-time employee for quite a while now, and a couple of months ago they started allowing me to work on AutoSpotting for 20% of my employment time in order to roll it out widely on their large fleet of AWS accounts. They also employ a couple of other occasional contributors, Artem Nikitin and Johannes ‘lenucksi’ Tigges.

Huge thanks to HERE for their support and the trust they put in this project.

New Patreon members

We recently got a few new Patreon members. Huge thanks to Golo Roden and Sumit Sarkar, who each pledged to donate $40 to the project every month in order to get access to the official binaries, while our first backer, Ivan Kalinin, keeps the $5 pledge he started a few months ago.

If you really like this project and/or your company benefits significantly from it, please consider joining them; it makes a huge difference to the project's further development. You also receive better support and pre-built binaries that can be easily rolled out.

What’s next?

We’ll continue polishing AutoSpotting as part of the rollout at HERE Technologies, addressing various issues reported by their development teams. The first on the list so far is support for instance termination protection, which will hopefully land within a few more weeks.

Also, Artem is making good progress on his X-Ray support patch, which will enable us to instrument the runtime performance of AutoSpotting and see where the bottlenecks are on really large installations, so we can improve it further.

That’s all folks

Thanks for reading this far, and I hope I convinced you to give the latest version of AutoSpotting a try.

If you have any questions or comments please get in touch, I’ll personally answer all of you.

-Cristian


AutoSpotting now handles complex launch configurations when replacing your EC2 instances with cheaper spot ones, and also got open-sourced.

Later Update: The code is now available on Github: https://github.com/cristim/autospotting

Today I finally reached a great milestone: for the first time I was able to get AutoSpotting to provide spot instance replacements for AutoScaling groups of on-demand instances with full-blown real-life configurations, including things like IAM roles, enhanced monitoring and attached IP addresses.

For those of you who are not familiar with it, have a look at this presentation as well as my previous blog posts, which explain what it does and present it in more detail.

In addition, for those folks who are still running stuff on EC2-Classic, the latest build now also supports EC2-Classic security groups, so those environments work as well.

In order to test all this for real, I enabled it on an existing development environment running on EC2-Classic, which from the infrastructure perspective happens to be configured almost identically to the environments serving https://maps.here.com, and I’m happy to say that it worked like a charm:

[Screenshot: replacing the group's instances]

In the image above you can see a screenshot taken while replacing the group’s instances.

Notice how in eu-west-1c we actually got an m1.medium instance, chosen in order to spread the group across multiple instance types, because at that time we already had another m3.medium instance in that Availability Zone, and choosing the same instance type on too many machines may become risky.

Currently the algorithm prefers the cheapest instance type, but in order to avoid placing all the eggs in the same basket, when more than 20% of the group’s total capacity already consists of the same spot instance type within a single Availability Zone, the next cheapest instance type is chosen for that zone, in order to reduce the chance of simultaneous failures of too many instances in case of sudden price fluctuations.
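In Go-flavored pseudocode, the selection described above looks roughly like this; it's a simplified sketch with made-up names, not the actual implementation:

// pickSpotType returns the cheapest candidate type that doesn't already
// make up more than 20% of the group's capacity in this availability zone.
// candidates is assumed to be sorted by ascending spot price.
func pickSpotType(candidates []string, countPerTypeInZone map[string]int,
    totalCapacity int) string {

    if len(candidates) == 0 {
        return ""
    }
    for _, t := range candidates {
        if float64(countPerTypeInZone[t]) >= 0.2*float64(totalCapacity) {
            continue // this type is already over the 20% threshold here
        }
        return t
    }
    // Fall back to the cheapest candidate if all are over the threshold.
    return candidates[0]
}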

To make things even more interesting, during the replacement process one of the new spot instances failed to be fully configured and didn’t become healthy before its grace period was over (we happen to have an overkill setup process running at instance startup which sometimes fails to finish during the allotted grace time), so it was terminated by AutoScaling immediately after being added to the group. AutoScaling soon replaced the failed instance with another on-demand instance, later to be replaced by a new, cheaper spot one. But eventually the group converged to a fully spot configuration.

Also, because the group’s scaling policy is currently based on CPU usage and has a quite low threshold, in the middle of all this replacement process a high-CPU alarm fired due to the load caused by the bootstrap of one of the new spot instances, so another instance was launched by AutoScaling, only to be replaced by a new spot instance that was later terminated by a subsequent scale-in operation.

Eventually all this churn ended, and a group that would previously cost about $98 per month now costs less than $17, assuming the price remains stable, which is more than 5.5 times cheaper over the long term.

So all in all it looks pretty good and reliable enough for dev environments (but I wouldn’t immediately put it in production) and it allows for huge cost savings. Feel free to give it a try using these instructions and let me know if you have any issues.

Before anyone asks, the software is not yet open sourced, but the review process is advancing fast and some important approvals are already there, so it’s now a matter of just a few more weeks.

Many of the latest improvements were developed with a lot of help from @nmeierpolys. His bug reports, suggestions and patience during multiple rounds of testing were priceless, and I am very thankful for all his contributions.

Known issues:

  • It is currently broken for environments where the instances are set up based on information stored in their EC2 tags. This is because the instance tags are currently set on the new instances very late, at the same time as the new instance is added to the AutoScaling group. So if the user_data script depends on information derived from the instance tags, that information will very likely be missing at the time the instance runs its user_data script, and the instance will fail to be configured. I am planning to set the EC2 tags much earlier, but your user_data script shouldn’t rely on them being there right when the instance starts.
  • The issue mentioned above was fixed as of July 17. The EC2 tags are now set as soon as the new spot instances are launched.

Automatic replacement of Autoscaling nodes with equivalent spot instances: seeing it in action

Over the last few days since my previous post I’ve had thousands of visitors, dozens of comments were posted and a few brave souls were even audacious enough to give it a try. Many of you provided valuable feedback and bug reports, so thank you all and keep the feedback coming!

I am now quite busy improving the software based on the feedback I’ve received so far and on some bugs I found on my own, but before I have anything ready to be released, I thought I should post a HOWTO that shows how to install and set it up, demonstrates the instance replacement process, and lists the currently known issues you should expect when using it at this point.

It’s still not ready for production usage, but I’m working on it.

Installation

The initial setup is done using CloudFormation, so you will need to launch a new CloudFormation stack. Since the stack creates a Lambda function, due to a Lambda limitation you can only launch the stack in us-east-1 (Virginia), but the stack can handle resources in all the other regions available to normal AWS accounts. For multiple reasons, the Beijing and GovCloud regions are currently unsupported.

Using the AWS console

Follow the normal stack creation process shown in the screenshots below.

The template URL is all you need to set, it should be https://s3.amazonaws.com/cloudprowess/dv/template.json

[Screenshot: Stack creation based on my template]

Give the stack a name, then you can safely go through the rest of the process. You don’t need to pass any other parameters, just make sure you confirm everything you set so far and acknowledge that the stack may create some IAM resources on your behalf.

[Screenshot: Naming the stack]

 

If everything goes well the stack will start creating resources.

[Screenshot: Creating the stack]

And after a few minutes you should be all set.

[Screenshot: The stack is ready]

Using the AWS command line tools

If you already have the AWS command line tools installed, you can also launch the stack from the command line, using the following command:

aws cloudformation create-stack \
--stack-name AutoReplaceWithSpot \
--template-url https://s3.amazonaws.com/cloudprowess/dv/template.json \
--capabilities CAPABILITY_IAM

Configuration for an AutoScaling group

The CloudFormation installation creates the required infrastructure, but your AutoScaling groups will not be touched unless you explicitly enable this functionality, which has to be done for each and every AutoScaling group you would like it to manage.

The managed AutoScaling groups can be in any other AWS region; the algorithm runs against all the regions in parallel, handling AutoScaling groups if and only if it was enabled for them. It makes no difference whether your group is running EC2-Classic or VPC instances, since both are supposed to be supported. If you notice any issues when testing it in your setup, that’s likely a bug and should be reported.

Enabling it on an AutoScaling group is a matter of setting a tag on the group:

Key: spot-enabled
Value: true

This can be configured with the AWS command-line tools using this command:

aws autoscaling create-or-update-tags \
--tags ResourceId=my-auto-scaling-group,ResourceType=auto-scaling-group,Key=spot-enabled,Value=true,PropagateAtLaunch=false

 

If you use the AWS console, follow the steps that you can see below:

[Screenshot: Initial state of the AutoScaling group]
[Screenshot: Tagging the AutoScaling group where it is being enabled]

The tag isn’t required to be propagated to the new instances, so that checkbox can remain empty.

The AutoScaling group tags can also be set using CloudFormation, just insert this snippet into your AutoScaling group’s configuration:

"MyAutoScalingGroup": {
  "Properties": {
    "Tags":[
    {
      "Key": "spot-enabled",
      "Value": "true",
      "PropagateAtLaunch": false
    }
    ]
  }
}

Walkthrough

Going forward I’m going to show what happens after enabling it on an AutoScaling group.

Once it is enabled on an AutoScaling group, the next run will launch a compatible EC2 spot instance.

Note: the new spot instance is not yet added to any of your AutoScaling groups.

The new instance type is chosen based on multiple criteria, and as per the current algorithm (this is a known issue and may be fixed at some point) it may not be the cheapest across all the availability zones, but it will definitely be cheaper than, and at least as powerful as, your current on-demand instances.

As you can see below, it launched a bigger m3.medium spot instance in order to replace a t1.micro on-demand instance. This also means that you can get bigger instances, such as c3.large spot instances, as long as their price is the lowest among the instance types compatible with your base instance type.

[Screenshot: Initial state of the EC2 instances]
[Screenshot: Launching a new spot instance, for now running outside the group]

The new instance’s launch configuration is copied, with very small modifications, from the one set on your on-demand instances, so the new instances will be configured as closely as possible to the instances previously launched by your AutoScaling group.

Note: We try to copy everything, including the user_data script, EC2 security groups (both VPC and Classic), IAM roles, instance tags, etc. If you notice any gaps, please report those as bugs.

After the spot instance is launched and has outlived its grace period (whatever was set on the AutoScaling group), it will be added to the group, and an existing on-demand instance will be terminated.

The AutoScaling group also adds it automatically to any load balancer configured for the group, so the instance will soon start receiving traffic. For instances that start handling requests as soon as their user_data script finishes executing, for example when processing data from an SQS queue, that may have already happened a while earlier, so the instance may already be in use even before being added to the group.

[Screenshot: Spot instance added to the group, replacing an on-demand instance]

Known bug: At the moment, if the group is at its minimum capacity, the algorithm needs another run and temporarily increases the capacity in order to be able to replace an on-demand instance. This should be more or less harmless, assuming that the AutoScaling rules will eventually bring the capacity back to the previous level. Sometimes this can interfere badly with your scaling policies, in which case you may end up in a spinning AutoScaling condition. It can be mitigated by tweaking the scale-down policy to make it less aggressive, for example by setting a longer wait time after scaling down. Update: this bug has since been fixed.

Continuing, in the next run, a second spot instance is launched outside the AutoScaling group:

[Screenshot: Second spot instance was launched]

Then, after the grace period passed, it is added to the AutoScaling group, replacing another on-demand instance that is detached from the group and terminated:

[Screenshot: Second on-demand instance was replaced]

This process repeats until you have no on-demand instances left running in the group, and you are only running spot instances.

If AutoScaling takes any scaling actions, like terminating any of the spot instances or launching new on-demand ones, we don’t interfere with it. But later we will attempt to replace any on-demand instances it might have launched in the meantime with spot equivalents, just like explained before.

Currently, due to the bug I mentioned previously, my setup ended up in a spinning state, but I managed to stabilize it by increasing the AutoScaling group’s scale-down cooldown period, and it eventually converged to this state (update: this bug has since been fixed):

[Screenshot: Final state]

Once I eventually release that bugfix, the group should converge to that state by itself, without any changes and much faster. Update: this issue was fixed and the instances are now replaced smoothly.

Conclusions

Many people commented asking how this solution compares with other automated spot bidders, such as the AWS-provided AutoScaling integration and the spot fleet API, as well as other custom or third-party implementations.

I think the main differentiator is the ease of installation and use, which you can see in this post. There are a few rough edges that will need some attention, but I’m working on it.

Please feel free to give it a try and report any issues you may face.

Please see my initial post for announcements about software updates.

 

My approach at making AWS EC2 ~80% cheaper: Automatic replacement of Autoscaling nodes with equivalent spot instances

Note

AutoSpotting, the tool described here, was since open sourced and is now available on GitHub.

This post merely documents the development history of the tool; it is seriously outdated and only kept here for historical reasons. It was written in April 2016 and describes the state of the tool at that moment. Please refer to the GitHub page for more up-to-date information about the current state of the project.

Getting started

Last year, at one of the sessions of the Berlin AWS meetup that I often attend, during the networking that happened after the event, @freenerd from Mapbox mentioned something about the spot market, saying how much cheaper it is for them to run instances there, but also that for their use case the instances were sometimes terminated in the middle of the batch processing job that prepares the map for the entire world.

A few weeks later, at another session of the AWS meetup, I participated in a similar discussion where someone mentioned the possibility of attaching instances to an on-demand AutoScaling group, a feature AWS had just released at that time. I don’t remember if spot was mentioned in the same discussion, or if it was all in my mind, but somehow these concepts got connected and I thought this would be a nice problem to hack on.

I thought about the problem for a while, and after a couple of weeks I came up with an algorithm based on the instance attach/detach mechanism supported by AutoScaling. I tested it manually and quickly confirmed that AutoScaling happily allows attaching spot instances and detaching on-demand ones in order to keep the capacity constant, but that it often tries to rebalance the availability zones. In order for it not to interfere with the automation, the trick is to keep the group more or less balanced across availability zones, so that AutoScaling won’t try to rebalance it.

I soon started coding a prototype in my spare time, which is actually my first non-trivial program written in a while, and to make it even more interesting, I chose to write it in golang.

Slow progress

After a few weeks of coding, in which I rewrote it at least twice (and even now I’m still nowhere near happy with how it looks), I realized it’s quite a bit harder and more complex than I initially thought. Other things happened and I kind of lost interest, so I stopped working on it and it all got stuck.

A few months later, at the re:Invent conference, I attended some talks where I met other folks interested in this problem and saw other approaches to attacking it, based on multiple AutoScaling groups, and that was also when I first got in touch with someone from SpotInst who was trying to promote their solution and was sharing business cards.

After re:Invent I became a bit more active for a while. I also tried to get some collaborators but failed at it, so I kept working on it in my spare time every now and then and got closer to making it work. Then I had a long vacation, and immediately after I returned I attended the Berlin AWS Summit, where I met the SpotInst folks once again. It seems they now have a full-fledged solution based on pretty much a reimplementation of AutoScaling as a SaaS, and they are apparently successful with it. This motivated me to work even harder on this, since my solution is simpler, cleaner and just as effective as theirs.

Breakthrough

After the Berlin AWS Summit, having my batteries charged, I resumed my work and after a few coding nights I managed to make my prototype work. It took much longer than expected, but at least I got there, yay! 🙂

What I have so far

  • An easy to install CloudFormation template that creates an SNS topic, a Lambda function written in golang (with a small JS wrapper that downloads and runs it) subscribed to the topic, and a few IAM settings to make it all work (update: this was largely simplified since)
  • A golang binary, for now closed source (update: not anymore), but I’m going to open it up once I get it into a good enough shape that I’m not ashamed of it and after I get all the approvals from my corporate overlords, who according to my employment contract need to approve the publishing of such non-trivial code

 

How does it work

The Lambda function is executed by a custom CloudFormation resource when creating the CloudFormation stack from the template, and it subscribes to both your topic and a topic that I run, which fires it every 30 minutes using a scheduled event.

When the scheduled event fires, the Lambda function concurrently inspects the AutoScaling groups from all the AWS regions, ignoring all those that are not tagged with the EC2 tags it expects.

The AutoScaling groups marked with the expected tag are processed concurrently, gradually replacing their on-demand instances with compatible spot instances, one at a time. Each run will either launch a single spot instance or attach a previously launched spot instance to the AutoScaling group, after detaching an on-demand one it is meant to replace. The spot instance is not attached while its uptime is less than the AutoScaling group’s grace period.
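Conceptually, the per-region fan-out described here can be pictured like this simplified sketch, where processRegion is a made-up helper standing in for the real per-region replacement logic:

import "sync"

// runOnAllRegions processes every AWS region concurrently; processRegion
// inspects a region's AutoScaling groups and replaces instances only in
// the groups tagged with spot-enabled=true.
func runOnAllRegions(regions []string) {
    var wg sync.WaitGroup
    for _, region := range regions {
        wg.Add(1)
        go func(r string) {
            defer wg.Done()
            processRegion(r)
        }(region)
    }
    wg.Wait()
}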

The spot instance bid price matches the price of the on-demand instance it is meant to replace. If your spot request is outbid, AutoScaling will handle it as a regular instance failure, and will immediately replace it with an on-demand instance. That instance will later be replaced by the cheapest available compatible spot instance, likely of a different type and with a different spot price.

In practice the group should converge to the most stable instance pricing, saving about 80% off the normal on-demand EC2 price over the long term.

How to use it/Getting started

All you need to do is set an EC2 tag on the AutoScaling group where you want to test it. Any other AutoScaling groups will be ignored.

The tag should have the following attributes:

Key: “spot-enabled”

Value: “true”

See my next blog post for a full installation and runtime walkthrough with very detailed instructions on how to get started.

Feedback is more than welcome

If you find any bugs or you would like to suggest any improvements, please get in touch on gitter or file an issue on GitHub.

Warning

This is experimental, summarily tested and likely full of bugs, so you should not run it on production, but it should be safe enough for evaluation purposes.

Anyway, use it at your own risk, and don’t hold me responsible for any misuse, bugs or damage this may cause you.

Update: many of these issues have since been ironed out and the tool is currently stable enough for production use cases. It is already in use at dozens of companies, where it is generating considerable savings. Feel free to also give it a try and report your feedback on gitter.

Later Updates

  • 27 Apr 2016
    • if you want to see it in action, please also check out my next blog post which walks you in detail through the installation and setup process, also explaining the currently known issues and their workarounds as of the time of the writing of this post.
  • 5 May 2016
    • bug fixes since the previous update
      • no longer spinning the AutoScaling group when running at minimum capacity
      • increased runtime frequency to once every 5min to make it converge faster
    • currently known issues
      • when first enabling it on a group with multiple on-demand nodes, it sometimes may launch extra spot instances that do not get added to the AutoScaling group (up to as many as the initial size of the group). Workaround: terminate them manually from the AWS console. They will not be re-launched
      • spinning condition when the AutoScaling group is set to a fixed size (Min=Max). Workaround: set Max to Min+1 and disable any scaling actions you may have configured, in order to keep the group at the minimum capacity
    • things currently being worked on
      • fixes for the known issues mentioned above
      • major under-the-hood code refactoring in preparation of open sourcing
      • choosing instance types that are unlikely to be terminated in the near future, based on historical stats data sourced from the Spot Bid Advisor
      • mark spot instances as protected from termination by AutoScaling
      • if you have any other suggestions please write a comment below.
  • 10 July 2016
    • new features
      • improved algorithm for picking the new spot instance type
        • always launch a new spot instance from the same zone as an existing on-demand instance. Previously the zone was the one where we had the fewest instances, which often may have been one where we had no running instances at all. This was causing the bugs about spinning and about the launch of additional spot instances that were not added to the group.
        • allow multiple spot instances of a given type in each availability zone, as long as their total number is less than 20% of the group’s total capacity. For example, a group of 15 instances using 3 availability zones will allow for 3 identical instances per availability zone, but the fourth instance in a zone will be of a different instance type.
      • Lambda wrapper updates
        • rewrote the Lambda wrapper in Python, which makes it more maintainable, since I’m much better at Python than at JavaScript
        • Implement versioning for the binary blob, by downloading the latest version only if not already there, based on the content SHA hash
      • CloudFormation cleanup
        • remove the SNS topic that was never used
      • support the new Mumbai region(still needs testing)
      • internal code refactoring
        • lots of cleanups that make it more maintainable
        • improved logging
    • bug fixes since the previous update
      • it should no longer start additional spot instances, the final capacity should match the original on-demand capacity, unless there were any AutoScaling actions.
      • fixed spinning condition with fixed-size AutoScaling groups by temporarily increasing the group during the replacement process
      • fix the choice of the cheapest compatible and sufficiently redundant spot instance type; previously any cheaper instance type may have been chosen, not necessarily the cheapest.
      • lots of other small bugfixes for various edge cases
    • things currently being worked on
      • open sourcing process was started, and I already got some of the required approvals
      • figuring out how to implement automated testing
    • backlog
      • choosing instance types that are unlikely to be terminated in the near future, based on historical stats data sourced from the Spot Bid Advisor
      • mark spot instances as protected from termination by AutoScaling
  • 13 July 2016
    • Mostly bugfixes, many thanks to @nmeierpolys for some very valuable bug reports, fast feedback and a lot of patience while testing it.
      • improved conversion of on-demand launch configuration fields into spot launch configuration equivalents. In addition to user_data, SSH keys, EBS volumes and many other mostly trivial-to-convert fields that were previously handled, the following more complex fields should now also be better handled, which makes it work in many more real-life environments:
        • EC2 Classic security groups
        • detailed instance monitoring
        • associating public IP addresses
      • It was successfully tested on complex EC2-classic and VPC setups where many of these fields were being used.
      • Compatibility notice: In the likely event that you are using IAM roles on your instances, you need to update to the latest version of the CloudFormation template, since launching such spot instances would otherwise fail due to missing IAM permissions required to run instances set up with IAM roles. Again, thanks to @nmeierpolys for finding this issue and proposing the fix.
  • 18 July 2016
    • Improve handling of storage volumes.
      • Bugfix: Fix panic while copying EBS storage configurations.
      • New feature: Implement a compatibility check for storage volumes, based on the number of attached ephemeral disk volumes present in the launch configuration. For example, an instance whose launch configuration attaches a couple of ephemeral SSD drives of a certain size will only be replaced by instance types that provide at least as many SSD devices, each at least as large, in order not to violate the storage expectations of the new instances.
  • This is largely outdated, current information about the project can be seen on GitHub