Elevating Orchestration: Navigating AWS Step Functions’ Latest Features

Chloe McAree
7 min read · Jan 29, 2024

I am a huge fan of AWS Step Functions and have been increasingly using them in both professional and personal projects. At this year’s AWS re:Invent, some pretty epic new features for Step Functions were announced, and I couldn’t wait to get my hands on them and try them out! In this blog, I’ll cover why I love Step Functions, the benefits they’ve brought me, and introduce some of the new features, discussing how they can impact future projects.

Why Step Functions?

Over the years, I have been working with various AWS serverless services, such as Lambda, EventBridge, SQS, DynamoDB, etc. A challenge I’ve encountered as I adopt more serverless services is that, as complexity grows, it becomes harder to orchestrate, visualise, and inspect the connections between them. This is where Step Functions come in.

Step Functions are a serverless orchestration solution that plays a pivotal role in managing multiple services. They provide a holistic view and management of the interconnection between services, which is crucial for complex architectures. Step Functions integrate with over 220 AWS services, are fully managed, scale automatically, and operate on a pay-per-use model.

What’s new with Step Functions?

1. Redrive from point of failure

This feature was actually announced two weeks before re:Invent, but I feel it’s a super powerful piece of functionality, so I wanted to include it.

Previously, when building State Machines within Step Functions, if a particular step in your State Machine failed, you would have to re-execute the entire State Machine again, incurring the costs of all those state transitions once more.

Now, if you encounter broken downstream logic within your State Machine or errors you did not catch, you can resolve those issues and then re-drive the State Machine execution from the point of failure. This preserves anything that completed successfully before the failure without having to restart entirely.

Here’s an example from my current State Machine: you can see there are a number of steps that have successfully passed, but unfortunately, the last state failed to execute due to an IAM permission error.

Failed State Machine execution

I can now go in and fix this IAM issue, and Redrive this execution from the point of failure. To do this within Workflow Studio, open the Actions drop-down menu and select Redrive.

Select Redrive in Workflow Studio

Once you click Redrive, it will show you from which state it is going to Redrive. Here, we can see it is where the execution failed last time.

With the fix in place, we can now see that the execution succeeded! One more thing to note about the Redrive functionality is the observability it gives you. In the screenshot below, the execution details make it evident that a Redrive took place and how many Redrives were performed, and this is recorded as part of the execution history.

It’s worth noting that you cannot use Redrive if you have changed the State Machine’s definition or the input payload.
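Redrive is not limited to the console; the same operation is exposed through the RedriveExecution API. Here is a minimal Python sketch, assuming a boto3 Step Functions client is passed in and that the execution ARN is your own. The helper name `redrive_if_possible` is purely illustrative, and the `redriveStatus` check reflects my understanding of the DescribeExecution response rather than anything shown in the console walkthrough above.

```python
def redrive_if_possible(sfn_client, execution_arn):
    """Redrive a failed execution from its point of failure, if allowed.

    `sfn_client` is assumed to be a boto3 Step Functions client, e.g.
    boto3.client("stepfunctions"). Returns the RedriveExecution response,
    or None when the execution is reported as not redrivable.
    """
    execution = sfn_client.describe_execution(executionArn=execution_arn)

    # Assumption: DescribeExecution reports whether a redrive is possible.
    if execution.get("redriveStatus") != "REDRIVABLE":
        return None

    # Restarts from the failed state; earlier successful states are kept.
    return sfn_client.redrive_execution(executionArn=execution_arn)
```

Because the client is passed in, the helper can be exercised with a stub in unit tests before pointing it at a real account.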

2. HTTP State

A major update for Step Functions announced at re:Invent was the ability to directly make HTTP requests from your State Machine to call any public API.

This feature can significantly expand the capabilities of your applications by allowing integration with virtually any public API. For example, you could post Slack notifications at different stages of execution, reach out to Salesforce for sales data insights, use Stripe for payments, and so much more. To authenticate these requests, you can use EventBridge connection resources for API destinations that support OAuth, Basic Authentication, and API keys.

EventBridge Connection for Auth

When using the HTTP state, you also have the ability to handle errors by checking the status code returned from your request. Here, you could implement a choice state to take different actions depending on whether the request was successful or if there was an error.
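To make the status-code check concrete, here is a small sketch that builds such a definition as a Python dictionary and prints it as Amazon States Language JSON. The endpoint, the connection ARN, and the state names are placeholders, and it assumes the HTTP Task output exposes the response code at `$.StatusCode`; treat it as an illustration rather than a production definition.

```python
import json


def http_task_with_status_check(endpoint, connection_arn):
    """Build an ASL definition: HTTP Task followed by a Choice on the status code."""
    return {
        "StartAt": "CallApi",
        "States": {
            "CallApi": {
                "Type": "Task",
                "Resource": "arn:aws:states:::http:invoke",
                "Parameters": {
                    "ApiEndpoint": endpoint,
                    "Method": "GET",
                    # The EventBridge connection supplies the credentials.
                    "Authentication": {"ConnectionArn": connection_arn},
                },
                "Next": "CheckStatus",
            },
            "CheckStatus": {
                "Type": "Choice",
                "Choices": [
                    # Assumption: the task output carries a numeric StatusCode.
                    {"Variable": "$.StatusCode", "NumericEquals": 200, "Next": "Done"}
                ],
                "Default": "RequestFailed",
            },
            "Done": {"Type": "Succeed"},
            "RequestFailed": {"Type": "Fail", "Error": "UnexpectedStatusCode"},
        },
    }


# Placeholder values for illustration only.
definition = http_task_with_status_check(
    "https://api.example.com/status",
    "arn:aws:events:eu-west-1:123456789012:connection/example/abc",
)
print(json.dumps(definition, indent=2))
```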

A huge advantage of this feature is that it reduces the need for intermediary Lambda functions for API calls, streamlining workflows, reducing costs and cold starts.

For instance, in one of the State Machines I am currently working on, I have an Invoke Lambda state that triggers a Lambda function responsible for authentication. This Lambda reaches out to Secrets Manager for an API key, calls out to the authentication service, retrieves a session token, and then stores this session token in DynamoDB. With this new feature, the entire flow can now be accomplished without the use of a Lambda function.
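A Lambda-free version of a flow like this could look roughly like the sketch below. This is not the actual State Machine from my project: the endpoint, table name, key attribute `pk`, and the `token` field in the response body are all hypothetical, and it assumes the API key lives in the EventBridge connection, so no explicit Secrets Manager lookup appears in the workflow.

```python
import json


def auth_flow_definition(auth_endpoint, connection_arn, table_name):
    """Build an ASL definition: fetch a session token over HTTP, then store it."""
    return {
        "Comment": "Illustrative auth flow without a Lambda function",
        "StartAt": "GetSessionToken",
        "States": {
            "GetSessionToken": {
                "Type": "Task",
                "Resource": "arn:aws:states:::http:invoke",
                "Parameters": {
                    "ApiEndpoint": auth_endpoint,
                    "Method": "POST",
                    "Authentication": {"ConnectionArn": connection_arn},
                },
                # Hypothetical response shape: {"token": "..."} in the body.
                "ResultSelector": {"sessionToken.$": "$.ResponseBody.token"},
                "Next": "StoreSessionToken",
            },
            "StoreSessionToken": {
                "Type": "Task",
                "Resource": "arn:aws:states:::dynamodb:putItem",
                "Parameters": {
                    "TableName": table_name,
                    "Item": {
                        "pk": {"S": "session-token"},
                        "sessionToken": {"S.$": "$.sessionToken"},
                    },
                },
                "End": True,
            },
        },
    }


# Placeholder values for illustration only.
auth_definition = auth_flow_definition(
    "https://auth.example.com/session",
    "arn:aws:events:eu-west-1:123456789012:connection/example/abc",
    "SessionTokens",
)
print(json.dumps(auth_definition, indent=2))
```

The `ResultSelector` trims the HTTP response down to the one field the next state needs, which keeps the DynamoDB state's input small and predictable.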

3. Test API

Another amazing announcement regarding AWS Step Functions was the Test API. This new feature adds a powerful dimension to testing within AWS Step Functions, allowing users to thoroughly test individual states without the need to create or update an entire state machine.

One of the core functionalities of the TestState API is its ability to test a state’s input and output data flow, including processing and transformations. This is crucial for ensuring that data is being handled correctly at each step of the workflow. Additionally, it facilitates testing of AWS service integrations, allowing developers to verify how a state interacts with requests and responses from other AWS services. The Test API also supports testing HTTP Task requests and responses, expanding the scope of testing to include external HTTP-based interactions.

Here is a simple example of using the Test State feature through Workflow Studio, but you can also use it through the AWS SDK or CLI. Within your State Machine, you can click into the state you wish to test and you will now see there is a Test State button.

You can see here that you are able to test with different state inputs and can view the input, afterInputPath and output.

The Test API supports testing a range of state types, including all Task types (except Activities), as well as Pass, Wait, Choice, Succeed, and Fail states. However, there are limitations to keep in mind. The TestState API does not support Task states that use Activities, Parallel or Map states, or service integration patterns such as .sync or .waitForTaskToken.
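These rules are simple enough to encode as a pre-check before calling the API. The following sketch is one possible interpretation of the limitations listed above; the function name and the string matching on the `Resource` ARN are my own assumptions, not an official validation routine.

```python
SUPPORTED_TYPES = {"Task", "Pass", "Wait", "Choice", "Succeed", "Fail"}
UNSUPPORTED_PATTERNS = (".sync", ".waitForTaskToken")


def can_test_state(state):
    """Return True if the state looks testable under the limits described above."""
    state_type = state.get("Type")
    if state_type not in SUPPORTED_TYPES:
        # Parallel and Map states (and anything unknown) are rejected here.
        return False

    if state_type == "Task":
        resource = state.get("Resource", "")
        # Activity ARNs contain an ":activity:" segment.
        if ":activity:" in resource:
            return False
        # Job-style and callback integration patterns are not supported.
        if any(pattern in resource for pattern in UNSUPPORTED_PATTERNS):
            return False

    return True
```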

4. Bedrock Integration

The recent announcement of AWS Step Functions’ integration with Amazon Bedrock marks a significant advancement in orchestrating generative AI tasks within AWS workflows. This integration not only simplifies the process of incorporating AI capabilities into applications but also reduces the need for additional coding and associated costs.

Two new Bedrock integrations have been introduced:

1. Invoke Model: Directly invoke a model to run inference with the prompts and input parameters you provide, with the option to interact with S3 for large payloads. This can be applied to text, image, and embedding models.

2. Create Model Customisation Job: Creates a job to fine-tune a base model, specifying the foundation model and training data location. It’s an asynchronous API, allowing Step Functions to pause execution while the job runs and automatically resume upon completion.

A huge advantage of this integration is that it facilitates the orchestration of interactions with foundation models, including steps for human intervention, without the need for extra integration code. Leveraging the Step Functions Workflow Studio, users can now also visually develop, inspect, and audit generative AI workflows. Overall, this integration adds a layer of sophistication to workflows, enabling complex tasks such as prompt-chaining and model customisation.
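As a rough illustration of prompt-chaining, the sketch below builds a two-step definition in which the second Invoke Model state consumes the text produced by the first. The model ID, the request body shape (`inputText`), and the response path (`$.Body.results[0].outputText`) are assumptions based on an Amazon Titan text model; other models expect different bodies, so adjust these to the model you actually use.

```python
import json

# Assumption: a Titan text model; swap in the model you have access to.
MODEL_ID = "amazon.titan-text-express-v1"


def invoke_model_state(prompt_template, input_path, next_state=None):
    """Build one Bedrock InvokeModel Task state from a prompt template."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::bedrock:invokeModel",
        "Parameters": {
            "ModelId": MODEL_ID,
            "Body": {
                # States.Format fills the {} placeholder from the state input.
                "inputText.$": f"States.Format('{prompt_template}', {input_path})"
            },
        },
        # Keep only the generated text so the next state can read $.text.
        "ResultSelector": {"text.$": "$.Body.results[0].outputText"},
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


def prompt_chain_definition():
    """Chain two model calls: summarise the input, then title the summary."""
    return {
        "StartAt": "DraftSummary",
        "States": {
            "DraftSummary": invoke_model_state(
                "Summarise this text: {}", "$.text", next_state="WriteTitle"
            ),
            "WriteTitle": invoke_model_state("Write a short title for: {}", "$.text"),
        },
    }


print(json.dumps(prompt_chain_definition(), indent=2))
```

An execution would start with an input such as `{"text": "..."}`; because each state reshapes its result to the same `text` key, further steps can be appended without changing the earlier ones.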

5. App Composer Integration

Now available in IDEs, this integration brings Workflow Studio into your development environment. This allows developers to design, edit, and manage workflows directly within their familiar coding environment, enhancing efficiency and reducing context switching.

Application Composer enables you to launch Step Functions Workflow Studio directly, providing a unified interface for workflow creation and management. Users have the flexibility to create new workflows or import existing ones into the Application Composer. This feature is particularly useful for integrating workflows that have been previously defined or need refinement. The Application Composer canvas facilitates the integration of workflows with other AWS resources, thereby enhancing the scope and functionality of these workflows.

As workflows and applications are designed within the Application Composer, it automatically generates Infrastructure as Code (IaC). This feature is a significant time-saver, guiding users toward deployment with less manual coding effort.

Conclusion

These updates to AWS Step Functions, announced at re:Invent, significantly enhance its utility for building and managing complex serverless workflows. They focus on improving efficiency, reducing costs, and simplifying the integration of external services, making AWS Step Functions an even more powerful tool in the AWS ecosystem.
