Demystifying AWS Step Functions: Mastering Workflow Orchestration in the Cloud

Ever felt overwhelmed by orchestrating complex workflows in the cloud? Juggling multiple tasks, ensuring their order, and handling errors can be a tangled mess. That's where AWS Step Functions comes in, your knight in shining armor for streamlined workflow orchestration. It's your secret weapon for building robust, serverless workflows that stitch together your cloud tasks, one step at a time. No more scripting intricate dependencies or juggling code: Step Functions connects the dots, from Lambda functions to data pipelines, and automates your processes with laser precision.

In the complex landscape of cloud computing, applications often comprise multiple services that need to work seamlessly together to deliver business processes. That's where Step Functions comes into play and empowers developers to effortlessly define, run, and scale workflows, offering a scalable solution for the complexities of orchestrating microservices-based architectures.

This blog delves into the depths of AWS Step Functions, uncovering its capabilities, use cases, and how it transforms the orchestration of AWS services. Let's dive into the serverless workflow revolution and discover how it simplifies your cloud operations, making you a true master of automation.

Introduction to AWS Step Functions

Before getting to Step Functions, let's understand what orchestration means.

Orchestration is the act of coordinating and automating complex workflows. It's like building a roadmap for your cloud tasks: it defines the order they run in, how they interact, and what happens when things go wrong.

Serverless orchestration is a subset of orchestration specifically related to orchestrating tasks within the context of serverless architectures using multiple services as required. This is a specialized type of orchestration tailored for the serverless paradigm, where tasks run on demand without server management.

Step Functions also provides Workflow Studio which is a low-code visual workflow designer that lets you create serverless workflows by orchestrating AWS services. Using its drag-and-drop feature or the built-in code editor, you can create and edit workflows, control how input and output are filtered or transformed for each state, and configure error handling. When you're finished, you can save your workflow, run it, and then examine the results in the Step Functions console. You can visually add and modify workflows to orchestrate the multiple services in your application.

Through the Step Functions graphical console, you see your application’s workflow as a series of event-driven steps.

Step Functions takes care of the execution, monitoring, and even retries, freeing you from the orchestration headache. Plus, its integration with other AWS services lets you weave together diverse tasks, achieving powerful automation without breaking a sweat.

Types of AWS Step Functions

Step Functions provides several ways to manage your microservice workflows.

AWS Step Functions offers 2 distinct types to cater to different workflow needs:

  1. Standard Workflow

    • For long-running workflows, you can use Standard Workflows.
  2. Express Workflow

    • For short-duration, high-volume workflows that require an immediate response, synchronous Express Workflows are ideal.

    • For short-duration workflows that do not require an immediate response, Step Functions provides asynchronous Express Workflows.

When you create a state machine, you select a Type of either Standard or Express. The default Type for state machines is Standard. A state machine whose Type is Standard is called a Standard workflow and a state machine whose Type is Express is called an Express workflow.

| Standard Workflows | Express Workflows |
|---|---|
| Long-running with access to visualization and full history in the console. | Short-running with access to results in Amazon CloudWatch Logs. |
| Maximum duration is 1 year. | Maximum duration is 5 minutes. |
| Exactly once execution model. | Asynchronous (at least once execution model). Synchronous (at most once execution model). |
| Execution state internally persisted on every state transition. | No internally persisted state for asynchronous executions. State machine logic must be idempotent. |
| Tailored for long-running, durable, and auditable workflows like an Amazon EMR cluster, or processing payments. | Suited for high-volume, event-driven tasks or workloads such as Internet of Things (IoT) data ingestion, processing and transforming streaming data, and managing backends for mobile applications. |
| Priced per state transition. | Priced by the number of times you run, their duration, and memory consumption. |
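
As a rough way to read the comparison above, the choice between the two types can be sketched as a small helper. The function name and the rule of thumb are assumptions made for illustration and are not part of any AWS SDK; only the five-minute and one-year limits and the exactly-once property come from the comparison itself.

```python
# Hypothetical helper: a rule of thumb derived from the comparison above,
# not an AWS API.
EXPRESS_MAX_SECONDS = 5 * 60            # Express: maximum duration is 5 minutes
STANDARD_MAX_SECONDS = 365 * 24 * 3600  # Standard: maximum duration is 1 year


def choose_workflow_type(expected_seconds: int, needs_exactly_once: bool) -> str:
    """Return "STANDARD" or "EXPRESS" for a workflow with the given needs."""
    if expected_seconds > STANDARD_MAX_SECONDS:
        raise ValueError("No workflow type runs longer than one year")
    # Exactly-once execution and long durations are only offered by Standard.
    if needs_exactly_once or expected_seconds > EXPRESS_MAX_SECONDS:
        return "STANDARD"
    return "EXPRESS"


print(choose_workflow_type(3600, needs_exactly_once=False))  # STANDARD
print(choose_workflow_type(20, needs_exactly_once=False))    # EXPRESS
print(choose_workflow_type(20, needs_exactly_once=True))     # STANDARD
```

Real decisions also weigh pricing and the need for execution history, which this sketch ignores.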

States and Amazon States Language (ASL) in AWS Step Functions

Step Functions is based on state machines, which consist of states. In Step Functions, a workflow is called a state machine, which is a series of event-driven steps. Each step in a workflow is called a state. A Task state represents a unit of work that another AWS service, such as AWS Lambda, ECS, DynamoDB, SNS, or SQS, performs. A Task state can call any AWS service or API.

States are elements in your state machine. A state is referred to by its name, which can be any string but it must be unique within the scope of the entire state machine.

Individual states can make decisions derived from their inputs, execute actions based on these inputs, and transmit output to subsequent states. Within AWS Step Functions, you articulate your workflows in the Amazon States Language (ASL). The Step Functions console furnishes a graphical representation of your state machine, aiding in the visualization of your application's logic.

The Amazon States Language is a JSON-based, structured language used to define your state machine: a collection of states that can do work (Task states), determine which states to transition to next (Choice states), stop an execution with an error (Fail states), and so on.
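
Because ASL is plain JSON, a definition can be assembled as an ordinary dictionary and serialized. The sketch below is a minimal illustration of the three state kinds just mentioned; the state names, the `in_stock` field, and the placeholder Lambda ARN are assumptions, not part of any real workflow.

```python
import json

# Hypothetical state names and a placeholder Lambda ARN, for illustration only.
definition = {
    "Comment": "Minimal sketch: one Task, one Choice, and two terminal states",
    "StartAt": "CheckStock",
    "States": {
        "CheckStock": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:<region>:<acc-id>:function:<function-name>",
            "Next": "InStock?",
        },
        "InStock?": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.in_stock", "BooleanEquals": True, "Next": "Done"}
            ],
            "Default": "OutOfStock",
        },
        "OutOfStock": {
            "Type": "Fail",
            "Error": "OutOfStock",
            "Cause": "The requested item is not available",
        },
        "Done": {"Type": "Succeed"},
    },
}

# ASL is plain JSON, so the dictionary serializes directly.
asl_text = json.dumps(definition, indent=2)
print(asl_text)
```

The printed text is what you would paste into the code editor of the Step Functions console after replacing the placeholder ARN.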

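Common fields while writing Amazon States Language (ASL). StartAt and States sit at the top level of the state machine definition; fields such as Type, Next, and End are set on each state -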

  1. StartAt (required) - A string that must exactly match (case sensitive) the name of one of the state objects.

  2. States (required) - An object containing a comma-delimited set of states.

  3. Type (required)- This is the state's type.

  4. OutputPath (optional) - This is a path that selects a portion of the state's output to be passed to the next state. If omitted, it has the value $, which designates the entire output.

  5. Next (optional) - This is the name of the next state to be run when the current state finishes. Some state types, such as Choice, allow multiple states.

  6. TimeoutSeconds (optional) - A top-level field that sets the maximum number of seconds an execution of the state machine can run. If it runs longer, the execution fails with a States.Timeout error.

  7. InputPath (optional) - This is the path that selects a portion of the state's input to be passed to the state's task for processing. If omitted, it has the value $, which designates the entire input.

  8. ResultPath (optional) - The output of a state can be a copy of its input, the result it produces (for example, output from a Task state’s Lambda function), or a combination of its input and result. Use ResultPath to control which combination of these is passed to the state output.

    The following state types can generate a result and can include ResultPath - Pass, Task, Parallel and Map.

    For more details, please refer to ResultPath

  9. End (required when Next is not used) - This designates this state as a terminal state (ends the execution) if set to true. There can be any number of terminal states per state machine. Only one of Next or End can be used in a state. Some state types, such as Choice, or terminal states, such as Succeed and Fail, don't support or use the End field.
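
A few of the rules in this list can be checked mechanically. The following is a minimal sketch of such a check; `check_definition` is a hypothetical helper written for this post, and it covers only the StartAt, Type, and Next/End rules described above, far less than the validation the service itself performs.

```python
# Hypothetical checker covering only the rules listed above; the real service
# validates far more than this.
NO_NEXT_OR_END = {"Choice", "Succeed", "Fail"}


def check_definition(definition: dict) -> list:
    """Return a list of problems found in an ASL definition (empty if none)."""
    problems = []
    states = definition.get("States", {})
    if definition.get("StartAt") not in states:
        problems.append("StartAt must exactly match the name of a state")
    for name, state in states.items():
        state_type = state.get("Type")
        if state_type is None:
            problems.append(f"{name}: Type is required")
            continue
        if state_type in NO_NEXT_OR_END:
            continue  # these state types do not use Next/End in the same way
        has_next = "Next" in state
        has_end = state.get("End") is True
        if has_next == has_end:
            problems.append(f"{name}: use exactly one of Next or End")
        elif has_next and state["Next"] not in states:
            problems.append(f"{name}: Next points at an unknown state")
    return problems


good = {"StartAt": "A", "States": {"A": {"Type": "Pass", "End": True}}}
bad = {"StartAt": "a", "States": {"A": {"Type": "Pass"}}}  # wrong case, no End
print(check_definition(good))  # []
print(check_definition(bad))
```

The second definition fails twice: StartAt is case sensitive, and the Pass state has neither Next nor End.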

Types of states in AWS Step Functions

There are 8 types of states which when combined and configured appropriately, enable you to design complex workflows. The flexibility provided by different states allows for the creation of dynamic and responsive workflows in response to a variety of conditions and inputs:

1. Task State

  • This represents a single unit of work in your workflow, executing specific tasks like:

    • Invoking Lambda functions

    • Calling AWS services (e.g., S3, DynamoDB)

    • Making HTTP requests to external APIs

  • You provide the task details (function name, input data, or parameters) and any required configuration.

2. Choice State

  • This represents a decision point in your workflow.

  • Evaluates input data or task results and based on the result of a Boolean expression, it transitions to different states.

  • Enables conditional logic for flexible scenarios.

3. Wait State

  • This pauses an execution for a specified number of seconds or until a specified timestamp.

  • Useful for fixed delays, such as giving an external system time to finish before the next step.

  • Can be used for scheduling tasks or coordinating with other systems.

4. Parallel State

  • This is a multitasker that executes multiple branches of your workflow simultaneously.

  • Each branch is an independent sequence of states and the state machine waits for all branches to finish before proceeding.

  • Can significantly improve performance for certain workloads.

5. Pass State

  • The simple passer, forwarding input without performing actions.

  • Useful for transitioning between states without additional processing.

  • Helps organize and structure complex workflows.

6. Succeed State

  • The triumphant end that stops an execution successfully.

  • Succeed states are terminal: they have no Next field and don't need an End field.

7. Fail State

  • Stops the execution and marks it as failed.

  • Can carry an Error name and a Cause description that explain the failure.

  • Typically used as the target of a Choice branch or a Catch block.

8. Map State

  • The iterator that executes a task for each item in an array.

  • Ideal for processing large datasets or collections of items in parallel.

  • Provides efficient and scalable iteration capabilities.
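
To make the transitions between state types concrete, here is a toy interpreter for four of them (Pass, Choice, Succeed, Fail). It is a teaching sketch under stated assumptions, not how the Step Functions service is implemented: it supports only the NumericGreaterThan comparison on top-level fields, and the workflow, state names, and `total` field are hypothetical.

```python
# A toy interpreter for a few state types (Pass, Choice, Succeed, Fail).
# Teaching sketch only: one comparison operator, top-level fields only.

def run(definition: dict, data: dict) -> tuple:
    """Walk the states and return (final_status, final_data)."""
    name = definition["StartAt"]
    while True:
        state = definition["States"][name]
        kind = state["Type"]
        if kind == "Succeed":
            return "SUCCEEDED", data
        if kind == "Fail":
            return "FAILED", {"Error": state.get("Error"), "Cause": state.get("Cause")}
        if kind == "Choice":
            name = state["Default"]  # used when no rule matches
            for rule in state["Choices"]:
                field = rule["Variable"][2:]  # strip the leading "$."
                if data[field] > rule["NumericGreaterThan"]:
                    name = rule["Next"]
                    break
            continue
        if kind == "Pass":
            # A Pass state forwards its input, or its Result field if one is set.
            data = state.get("Result", data)
            if state.get("End"):
                return "SUCCEEDED", data
            name = state["Next"]
            continue
        raise ValueError(f"Unsupported state type: {kind}")


workflow = {
    "StartAt": "CheckTotal",
    "States": {
        "CheckTotal": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.total", "NumericGreaterThan": 1000, "Next": "Reject"}
            ],
            "Default": "Approve",
        },
        "Approve": {"Type": "Pass", "Next": "Done"},
        "Reject": {"Type": "Fail", "Error": "LimitExceeded", "Cause": "Total above 1000"},
        "Done": {"Type": "Succeed"},
    },
}

print(run(workflow, {"total": 250}))   # ('SUCCEEDED', {'total': 250})
print(run(workflow, {"total": 5000}))
```

The second call ends in the Fail state, so its result carries the Error and Cause instead of the input data.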

Tutorial: Creating a Step Functions state machine that uses AWS Lambda

In this tutorial, you will create a single-step workflow using AWS Step Functions to invoke an AWS Lambda function. It follows the tutorial in the AWS documentation.

Let's start the tutorial which includes creating the lambda function first from the lambda console and then creating state machines from the Step Function console.

  1. Create a Lambda function. Its logic depends on your requirements; here we create a basic one to understand how a Step Functions execution works with a Lambda function.

     export const handler = async (event) => {
         // Log the incoming event for debugging
         console.log("EVENT ===>", event);
         // An async handler returns its result instead of calling a callback
         return "Hello from " + event.who + "!";
     };
    
  2. Test the Lambda Function by passing the event locally (using the actual event). This is done just to check whether our lambda function is working correctly or not.

  3. Create a state machine from the Step Functions console that has the Lambda function as a task.

    This is the definition of the state machine.

     {
       "Comment": "A description of my state machine",
       "StartAt": "Lambda Invoke",
       "States": {
         "Lambda Invoke": {
           "Type": "Task",
           "Resource": "arn:aws:states:::lambda:invoke",
           "OutputPath": "$.Payload",
           "Parameters": {
             "Payload.$": "$",
             "FunctionName": "arn:aws:lambda:<region>:<acc-id>:function:<function-name>:$LATEST"
           },
           "Retry": [
             {
               "ErrorEquals": [
                 "Lambda.ServiceException",
                 "Lambda.AWSLambdaException",
                 "Lambda.SdkClientException",
                 "Lambda.TooManyRequestsException"
               ],
               "IntervalSeconds": 1,
               "MaxAttempts": 3,
               "BackoffRate": 2
             }
           ],
           "End": true
         }
       }
     }
    

    If you test the function with the default test event, which has no who field, the greeting contains undefined. The actual event is passed directly from the Step Functions console when the execution starts in the next step.

  4. Run the State Machine by providing the below input and start the execution from the Step Function Console.

     {
       "who": "AWS Step Functions"
     }
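
Two details of the definition in step 3 are easy to miss: the Lambda integration wraps the function's return value in a Payload field, and the Retry block backs off exponentially. The sketch below illustrates both in plain Python. The `task_result` dictionary is a simplified stand-in (the real result carries more metadata fields), and `retry_delays` is a hypothetical helper, not an AWS API.

```python
# Simplified stand-in for the result of the lambda:invoke integration; the
# real result carries more metadata fields than shown here.
task_result = {
    "StatusCode": 200,
    "Payload": "Hello from AWS Step Functions!",
}

# "OutputPath": "$.Payload" keeps only the Lambda function's return value.
state_output = task_result["Payload"]
print(state_output)


def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Seconds waited before each retry attempt under exponential backoff."""
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


# The Retry block uses IntervalSeconds 1, MaxAttempts 3, BackoffRate 2.
print(retry_delays(1, 3, 2))  # [1, 2, 4]
```

So a failing invocation is retried after roughly 1, 2, and 4 seconds before the error is surfaced.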
    

Input and Output Processing in Step Functions

When Step Functions is invoked, it receives a JSON text as input and passes that input to the first state in the workflow. Individual states receive JSON as input and usually pass JSON as output to the next state.

Step Functions can be effectively designed not only by understanding the flow of data from one state to another but also by knowing how to manipulate and filter the data. The fields that filter and control the flow from state to state in the Amazon States Language are:

  1. InputPath - It selects the portion of the state input that is passed on to the task.

  2. Parameters - It builds the JSON that the task actually receives, mixing static values with values selected from the filtered input.

  3. ResultSelector - It lets you choose only the data you need from the task's raw result. For example, you might use ResultSelector to extract a customer's ID or order status from a complex response, focusing on the essential details.

  4. ResultPath - It determines where the task result is placed relative to the state input: it can replace the input, be added under a key, or be discarded.

  5. OutputPath - It filters the combined JSON before it is passed to the next state. You might use OutputPath to remove sensitive information or fields that later states don't need.

They are listed here in the order in which Step Functions applies them.

The following diagram shows how JSON information moves through a task state. InputPath selects which parts of the JSON input (state input) to pass to the task of the Task state (for example, an AWS Lambda function). ResultPath then selects what combination of the state input and the task result to pass to the output. OutputPath can filter the JSON output to further limit the information that's passed to the output.
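
The flow just described can be approximated in a few lines of Python. This is a minimal sketch, assuming only simple paths such as "$" or "$.a.b" (no array indexes or filters) and leaving out Parameters and ResultSelector; the helper names and the sample data are hypothetical.

```python
import copy


def select(path: str, document):
    """Resolve a simple path such as "$" or "$.a.b" (no arrays or filters)."""
    if path == "$":
        return document
    value = document
    for key in path[2:].split("."):
        value = value[key]
    return value


def merge_result(state_input: dict, result, result_path: str):
    """Place the task result into a copy of the state input at result_path."""
    if result_path == "$":
        return result  # the result replaces the input entirely
    merged = copy.deepcopy(state_input)
    target = merged
    keys = result_path[2:].split(".")
    for key in keys[:-1]:
        target = target.setdefault(key, {})
    target[keys[-1]] = result
    return merged


def run_task_state(state_input, task, input_path="$", result_path="$", output_path="$"):
    """Apply InputPath, the task, ResultPath, and OutputPath in that order."""
    task_input = select(input_path, state_input)
    result = task(task_input)
    combined = merge_result(state_input, result, result_path)
    return select(output_path, combined)


state_input = {"order": {"id": "A1", "qty": 2}, "note": "gift"}
output = run_task_state(
    state_input,
    task=lambda order: {"reserved": order["qty"]},  # stands in for a Lambda function
    input_path="$.order",
    result_path="$.stock",
)
print(output)
# {'order': {'id': 'A1', 'qty': 2}, 'note': 'gift', 'stock': {'reserved': 2}}
```

The task only sees the order object, while the state output keeps the whole input and adds the result under the stock key.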

Implementing Input and Output Processing in Step Function

Let's walk through the basic example regarding the input and output processing in the AWS Step Function.

  1. Create the Lambda function that adds the two numbers and returns the sum under the add_number key.

     def lambda_handler(event, context):
         # Default to 0 so that a missing number does not break the addition
         num1 = event.get("num1", 0)
         num2 = event.get("num2", 0)
         result = {
             "add_number": num1 + num2
         }
         return result
    

  2. Create a state machine from the Step Functions console with a single Task state that references the Lambda function by its ARN. InputPath is set to "$", so the whole input is passed to the function unchanged. ResultPath is set to "$.output", so the function's result is stored under the output key alongside the original input. End marks this as the last state of the state machine.

     {
       "Comment": "A Step Function example with InputPath, ResultPath.",
       "StartAt": "Addition",
       "States": {
         "Addition": {
           "Type": "Task",
           "Resource": "arn:aws:lambda:<region>:<acc-id>:function:<function-name>",
           "InputPath": "$",
           "ResultPath": "$.output",
           "End": true
         }
       }
     }
    

  3. Run the state machine: start an execution from the Step Functions console with the input below. After the execution succeeds, the result appears in the execution's output section.

     {
       "num1": 5,
       "num2": 10
     }
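
To see what that output should look like, the state can be reproduced locally. This is a sketch that mirrors the handler from step 1 and applies ResultPath by hand; it does not call AWS, and it uses 0 as the default for missing numbers.

```python
def lambda_handler(event, context):
    # Same logic as the addition function in step 1, with numeric defaults.
    return {"add_number": event.get("num1", 0) + event.get("num2", 0)}


execution_input = {"num1": 5, "num2": 10}

# "InputPath": "$" passes the whole input to the function.
result = lambda_handler(execution_input, None)

# "ResultPath": "$.output" adds the result under the "output" key.
execution_output = {**execution_input, "output": result}
print(execution_output)
# {'num1': 5, 'num2': 10, 'output': {'add_number': 15}}
```

The original input survives because ResultPath adds the result next to it instead of replacing it.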
    

AWS Step Function Integration with Other AWS Services

You have the flexibility to set up your Step Functions workflow to invoke various AWS services including :

  • Compute: Imagine tiny robots working for you! Services like AWS Lambda, Amazon ECS, and Amazon EKS let you run code without managing servers (serverless) or orchestrate containers (like Docker) for flexible, scalable computing. AWS Fargate runs those containers for you without any servers to manage.

  • Databases: Need a robust data home? Amazon DynamoDB is a super-fast, flexible NoSQL database perfect for applications with ever-changing data needs.

  • Messaging and Notifications: Think of megaphones and mailboxes for your cloud. Amazon SNS blasts messages to large audiences and Amazon SES is used when we want to send customized emails, while Amazon SQS acts as a queue for sending messages between services.

  • Data Processing and Analytics: Need to make sense of all your data? Amazon Athena lets you run queries directly on S3 storage like a virtual data warehouse. AWS Batch runs large-scale batch jobs, while AWS Glue and Amazon EMR help you build data pipelines and run analytics on big data. AWS Glue DataBrew even lets you clean and prepare data visually.

  • API Management: Think of a concierge for your cloud services. Amazon API Gateway lets you create secure, managed APIs to control access to your resources.

  • Machine Learning: Want your cloud to learn and adapt? Amazon SageMaker helps to prepare, build, train, and deploy ML models into a production-ready hosted environment.

  • SDK Integrations: Need to talk to all the other services in your cloud? The AWS SDK integrations let a workflow call the API actions of over 200 AWS services directly from a Task state, so you can connect them all and unlock the full potential of your cloud setup.
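
As an illustration of what such integrations look like in a definition, the sketch below builds two Task states in Python, one writing to DynamoDB and one publishing to SNS. The state names, table name, topic ARN, and the `order_id` field are placeholders and assumptions; treat it as a starting point to compare against the service integration documentation rather than a tested deployment.

```python
import json

# Placeholder ARNs and table name: replace them with real values.
# State names and the order_id field are hypothetical.
states = {
    "SaveOrder": {
        "Type": "Task",
        "Resource": "arn:aws:states:::dynamodb:putItem",
        "Parameters": {
            "TableName": "<table-name>",
            # A key ending in ".$" takes its value from the state input.
            "Item": {"order_id": {"S.$": "$.order_id"}},
        },
        # A null ResultPath discards the result and passes the input through.
        "ResultPath": None,
        "Next": "NotifyCustomer",
    },
    "NotifyCustomer": {
        "Type": "Task",
        "Resource": "arn:aws:states:::sns:publish",
        "Parameters": {
            "TopicArn": "arn:aws:sns:<region>:<acc-id>:<topic-name>",
            "Message.$": "$.order_id",
        },
        "End": True,
    },
}

print(json.dumps(states, indent=2))
```

No Lambda function is involved here: each Task state calls the service directly.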

Orchestrating Order Processing: A Step Functions Workflow with Lambdas Use case

Let's dive into the practical implementation of the step function.

  1. Firstly, we need to create 2 Lambda Functions (ProcessOrderLambda and SendShippingDetailsLambda).

    Here, I am creating a basic lambda function that just shows how the step function works and how it orchestrates multiple lambda functions and produces a desired result at the end of execution.

    The lambda function logic can be of any complexity as per the requirement.

    Let's walk through each of the Lambda functions in a bit of detail.

    ProcessOrderLambda - This function receives the order_id and items in its event, calculates the total_price, and returns a result object that holds the order details.

     # ProcessOrderLambda 
     import json 
    
     def lambda_handler(event, context):
         # Assume 'event' contains details about the order
         order_id = event.get("order_id", "")
         items = event.get("items", [])
    
         # Process the order (In a real scenario, you might interact with databases, external APIs, etc.)
         total_price = sum(item.get("price", 0) * item.get("quantity", 1) for item in items)
    
         result = {
             "order_id": order_id,
             "total_price": total_price,
             "status": "Processed"
         }
         return result
    

    SendShippingDetailsLambda - This function receives the processed order as its event (the state machine selects it with InputPath), reads the order_id from it, and then creates and returns the shipping_result object that holds the shipping-related details.

     # SendShippingDetailsLambda
     def lambda_handler(event, context):
         # InputPath "$.processed_order" makes the processed order the whole
         # event, so order_id is read directly from it
         order_id = event.get("order_id", "")
    
         # Send a shipping notification (In a real scenario, you might use SNS, email, etc.)
         shipping_result = {
             "detail_sent": True,
             "message": f"Order {order_id} processed and shipped."
         }
         return shipping_result
    

  2. Create a state machine from the Step Functions console with the two Lambda functions as Task states, each referenced by its ARN, and a Wait state between them that pauses the execution for 3 seconds.

     {
       "Comment": "E-commerce Order Processing State Machine",
       "StartAt": "ProcessOrder",
       "States": {
         "ProcessOrder": {
           "Type": "Task",
           "Resource": "arn:aws:lambda:<region>:<acc-id>:function:<function-name>",
           "InputPath": "$",
           "ResultPath": "$.processed_order",
           "Next": "UpdatingDetail"
         },
         "UpdatingDetail": {
           "Type": "Wait",
           "Seconds": 3,
           "Next": "SendShippingDetail"
         },
         "SendShippingDetail": {
           "Type": "Task",
           "Resource": "arn:aws:lambda:<region>:<acc-id>:function:<function-name>",
           "InputPath": "$.processed_order",
           "ResultPath": "$.shipping_result",
           "End": true
         }
       }
     }
    

  3. Run the state machine: start an execution from the Step Functions console with the input below. After the execution succeeds, the result appears in the execution's output section.

     {
       "order_id": "123456",
       "items": [
         {
           "product": "Laptop",
           "price": 1200,
           "quantity": 1
         },
         {
           "product": "Mouse",
           "price": 20,
           "quantity": 2
         }
       ]
     }
    

    The execution view in the console shows the input and output of the whole execution as well as of each step, including the output of the first step (ProcessOrderLambda). The final result of the execution includes the processed order details and the shipping details.
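
The data flow of this state machine can be traced locally without AWS. The sketch below condenses the two handlers and applies each state's InputPath and ResultPath by hand; the function names are hypothetical, and the second handler reads order_id straight from its event because InputPath "$.processed_order" hands it the processed order itself.

```python
def process_order(event, context):
    items = event.get("items", [])
    total = sum(i.get("price", 0) * i.get("quantity", 1) for i in items)
    return {"order_id": event.get("order_id", ""), "total_price": total, "status": "Processed"}


def send_shipping_details(event, context):
    # InputPath "$.processed_order" means the event IS the processed order.
    return {
        "detail_sent": True,
        "message": f"Order {event.get('order_id')} processed and shipped.",
    }


execution_input = {
    "order_id": "123456",
    "items": [
        {"product": "Laptop", "price": 1200, "quantity": 1},
        {"product": "Mouse", "price": 20, "quantity": 2},
    ],
}

# ProcessOrder: InputPath "$", ResultPath "$.processed_order"
state = {**execution_input, "processed_order": process_order(execution_input, None)}
# UpdatingDetail: the Wait state passes its input through unchanged.
# SendShippingDetail: InputPath "$.processed_order", ResultPath "$.shipping_result"
state = {**state, "shipping_result": send_shipping_details(state["processed_order"], None)}

print(state["processed_order"])
# {'order_id': '123456', 'total_price': 1240, 'status': 'Processed'}
print(state["shipping_result"])
# {'detail_sent': True, 'message': 'Order 123456 processed and shipped.'}
```

The final state still contains the original order_id and items, with processed_order and shipping_result added next to them.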

This use case demonstrates the orchestration of tasks in a basic e-commerce order processing system using AWS Step Functions. The steps can be expanded based on the complexity of your business logic and integrations with other AWS services.

In summary, AWS Lambda executes individual functions triggered by events, while AWS Step Functions coordinates and orchestrates multiple AWS service interactions, including Lambda functions, in a workflow defined by a state machine.

Features of AWS Step Function

  1. Automatic Scaling

    AWS Step Functions automatically scales the operations and underlying compute to run the steps of your application for you in response to changing workloads. Step Functions scales automatically to help ensure the performance of your application workflow remains consistent as the frequency of requests increases.

  2. High Availability

    AWS Step Functions has built-in fault tolerance and maintains service capacity across multiple Availability Zones in each region to protect applications against individual machine or data center failures. This helps ensure high availability for both the service itself and for the application workflow it operates.

  3. Pay Per Use

    With Standard Workflows, you pay for each transition from one state to the next. Billing is metered by state transition, and you do not pay for idle time, regardless of how long each state persists (up to one year). This keeps Step Functions cost-effective as you scale from a few executions to tens of millions.

  4. Security and Compliance

    AWS Step Functions is integrated with AWS IAM and recommends a least-privileged IAM policy for all of the resources used in your workflow. You can access AWS Step Functions from VPC-enabled AWS Lambda functions and other AWS services without traversing the public internet using AWS PrivateLink.

References

Step Function - Amazon States Language (ASL)

Step Function - Input and Output Processing

Step Function - Workflow Studio

Step Function - Tutorials

Step Function - Working with other services

Conclusion

We've journeyed through the world of AWS Step Functions, unpacking its potential to orchestrate complex workflows with elegance and ease. We covered its core concepts, crafting state machines, processing data, a few use cases, and its main features. Hopefully, this blog has ignited your enthusiasm for serverless orchestration.

Remember, Step Functions isn't just a tool, it's a paradigm shift. It liberates you from the tangled web of manual management, letting you focus on designing workflows that are not only functional but beautiful in their simplicity. Embrace the power of states, embrace the flow of data, and step into the future of cloud automation.

Thank you for dedicating time to read my blog! 😊 I trust you discovered it to be beneficial and insightful. If so, please show your appreciation with a 👍 like and consider 💌 subscribing to my newsletter for more content like this.

I'm continuously seeking ways to enhance my blog, so your comments or suggestions are always welcome 💬

Once again, thank you for your ongoing support!


Did you find this article valuable?

Support Cloud & Devops with Rachit by becoming a sponsor. Any amount is appreciated!