A basic microservice that takes input from API Gateway and passes it on to other services or a queue. In my designs, I try to make this the only microservice that takes direct input from API Gateway, as it gives me a single place to closely inspect and monitor the incoming data for security or other issues. Typically, I will have API Gateway validate and structure the input parameters and add an “action” param based on the API path that was requested. This router function looks at that param and decides what to do with the data; it could pass it on to a basic CRUD function or a task manager, for example. It does this with the “RequestResponse” invocation type and then waits for the response to return it to the user via API Gateway. My router function logs all requests with the user and other relevant information to CloudWatch for tracking and analysis.
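A minimal sketch of that routing logic could look like the following. The action names and the injected `invoke` dependency are hypothetical; in a real function, `invoke` would be the AWS SDK Lambda client's invoke call.

```javascript
// Map of "action" params (added by API Gateway) to downstream functions.
// These names are illustrative, not a real API.
const ROUTES = {
  createNote: { target: "crud-service" },   // synchronous CRUD function
  processImage: { target: "task-manager" }, // decoupled, via the task queue
};

async function route(event, invoke) {
  const entry = ROUTES[event.action];
  if (!entry) {
    return { statusCode: 400, body: `Unknown action: ${event.action}` };
  }
  // "RequestResponse" makes the call synchronous, so the downstream
  // result can be returned to the user through API Gateway.
  const result = await invoke({
    FunctionName: entry.target,
    InvocationType: "RequestResponse",
    Payload: JSON.stringify({ action: event.action, data: event.data }),
  });
  return { statusCode: 200, body: result.Payload };
}
```

Logging each request to CloudWatch would happen inside `route` before the invoke call.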
- Basic CRUD
Most APIs probably have one of these: a simple processing function that can perform “create / retrieve / update / delete” actions on a specified database table, storage facility or other service. In some projects, you can use the same function for multiple tables because authentication and data validation can be done entirely in API Gateway. In that case, the router function can pass along a param indicating which table to perform the action on, based on the API request path. These functions offer a quick response to the user, but they lose some resilience compared to decoupled functions. (AWS Amplify will generate a Lambda function for this if you indicate a CRUD type when creating your API.)
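To illustrate how one function can serve multiple tables, here is a hypothetical sketch that translates a validated action plus a table name into the DynamoDB request parameters; the `id` key and `title` attribute are assumptions for the example.

```javascript
// Build DynamoDB request params for a generic CRUD handler.
// The actual call (e.g. via the AWS SDK DocumentClient) is omitted.
function buildDynamoParams(action, table, data) {
  switch (action) {
    case "create":
      return { op: "put", params: { TableName: table, Item: data } };
    case "retrieve":
      return { op: "get", params: { TableName: table, Key: { id: data.id } } };
    case "update":
      // Example: update a single "title" attribute.
      return {
        op: "update",
        params: {
          TableName: table,
          Key: { id: data.id },
          UpdateExpression: "SET #t = :t",
          ExpressionAttributeNames: { "#t": "title" },
          ExpressionAttributeValues: { ":t": data.title },
        },
      };
    case "delete":
      return { op: "delete", params: { TableName: table, Key: { id: data.id } } };
    default:
      throw new Error(`Unsupported action: ${action}`);
  }
}
```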
- Decoupled Microservice
These are microservices that sit behind an SQS queue. A router or task manager adds tasks to the queue, which triggers this function. Tasks are set to an “in progress” state while the function attempts to process them; any failure or timeout means SQS will retry until it succeeds. You can configure the queue so that after X failed attempts a task is moved to a “dead-letter queue”, which can trigger alerts to an admin. If the function succeeds, it can initiate a follow-up action or return the task response to the task manager queue.
Because of the SQS queue, a decoupled microservice is asynchronous. This means the function runs in the backend while the frontend continues with whatever it needs to do rather than waiting for a response. If a response is needed from this microservice, it has to be handled as a new action and pushed to your frontend. For a web application, this can be solved by implementing WebSockets so that a push notification can be sent to the browser to let the user know the action has completed. Alternatively, you could keep polling an endpoint to see when the result is ready, but this is not efficient. A good use case for a decoupled microservice would be an image processor.
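The skeleton of such a worker is small. In this sketch, `processImage` is a hypothetical processing step injected for clarity; Lambda receives a batch of SQS records, and throwing an error leaves the messages on the queue to be retried (and eventually moved to the dead-letter queue).

```javascript
// SQS-triggered worker: each record body carries one JSON-encoded task.
async function handler(event, processImage) {
  const results = [];
  for (const record of event.Records) {
    const task = JSON.parse(record.body);
    // Any thrown error here makes SQS retry after the visibility timeout.
    results.push(await processImage(task));
  }
  return results;
}
```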
More info about using SQS with Lambda here
More info about setting up web sockets with API Gateway here
- Task manager
A task manager is a decoupled microservice that sits between a router and (in my case) decoupled microservices. The task manager has an SQS queue that the router can drop tasks into.
A task would at least have params for:
- The microservice that this task is for (could also be deduced from the action)
- The action to perform in that microservice (if there are multiple)
- The data that was provided for the action (if any)
- The target the response should go to (or blank if no response is needed)
The task is added to the SQS queue, which then triggers the task function. The task function assesses the task and passes the information on to the target, flagging the task as in-progress for a specified timeout. If the target fails to handle the request, the task becomes active again in the task queue and the task manager can try again for the configured number of retries.
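A task message matching the params listed above might look like this sketch; the field names, service names and the injected `send` dependency are hypothetical.

```javascript
// Example task message a router could drop into the task manager's queue.
const task = {
  service: "image-processor",           // which microservice the task is for
  action: "createThumbnail",            // the action within that microservice
  data: { bucket: "uploads", key: "photo.jpg" },
  responseTarget: "websocket:user-123", // where the result should go ("" if none)
};

// The task manager validates the task and passes it on to the target.
function dispatch(task, send) {
  if (!task.service || !task.action) throw new Error("Invalid task");
  return send(task.service, {
    action: task.action,
    data: task.data,
    responseTarget: task.responseTarget,
  });
}
```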
For some simpler use cases, it’s also possible to use SNS instead of a task manager. The reason I tend to use a Lambda task manager is that it gives me a bit more flexibility in how I handle, log, track and respond to the tasks being passed around.
- Gatekeeper
This is what I call a service that has exclusive access rights to data in a specific database, storage facility or any other service. Any microservice (or user) wanting to do anything with this data must go via the gatekeeper. This gives me a single location where I can verify the authorization and validity of each request, ensure consistency, and log actions to create an audit trail for everything related to this data. Typically, I would only do this for sensitive data such as personal information or financial transactions.
To create a gatekeeper, you simply assign it a role with access to a specific DynamoDB table, S3 bucket or other service, and make sure no other microservice is given access to that resource. Any other microservice that wants to perform any activity with this data must retrieve it from, or pass it to, the gatekeeper microservice. All available logging and monitoring capabilities should be enabled for these functions, and every action within your code should be recorded to CloudWatch Logs for monitoring.
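The role's policy would grant the gatekeeper, and only the gatekeeper, access to the data store. A sketch of such a policy for a single DynamoDB table follows; the region, account ID and table name are placeholders.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:DeleteItem"
      ],
      "Resource": "arn:aws:dynamodb:eu-west-1:123456789012:table/SensitiveData"
    }
  ]
}
```

No other function's role should include this table in its `Resource` list.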
- Trigger
A trigger microservice gets triggered by actions on other services, such as a file being uploaded to S3 or a new entry in a DynamoDB table. The microservice can then perform relevant follow-up actions such as updating a database with the new file entry, starting a media processing service, synchronization or backups. An example of having S3 trigger a Lambda microservice can be found here
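For the S3 case, the handler mostly consists of unpacking the event. In this sketch, `recordFile` is a hypothetical follow-up step (e.g. writing the file entry to a database); the event shape is the standard S3 notification structure.

```javascript
// S3-triggered function: extract bucket/key/size from each record and
// hand them to a follow-up step.
async function handler(event, recordFile) {
  const entries = event.Records.map((record) => ({
    bucket: record.s3.bucket.name,
    // S3 URL-encodes keys and encodes spaces as "+".
    key: decodeURIComponent(record.s3.object.key.replace(/\+/g, " ")),
    size: record.s3.object.size,
  }));
  for (const entry of entries) await recordFile(entry);
  return entries;
}
```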
- Log Monitor
This is a trigger function that is specifically triggered by CloudWatch Logs. It retrieves the log, parses it and decides what to do with it based on the log content. This could include notifying an admin of critical issues, blocking a user account for suspicious behaviour, or simply displaying an alert in a dashboard. The way I utilize this is with a console.log wrapper function in a Lambda layer’s shared “utils.js” file. This wrapper takes a number of params such as user, function, urgency, sensitivity, message and data, and logs them to CloudWatch as a JSON string. My log monitor microservice recognizes this as an application log, parses it to an object and stores it in DynamoDB. If it is critical, it will publish a message to an SNS topic that the admins and developers can subscribe to. If it is marked as sensitive, I can store it in a separate database just for this type of data. When it’s released, I’m hoping to use QLDB to store immutable logs of certain activities so they cannot be deleted by a potential attacker.
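A sketch of the kind of wrapper described above; the field names and the `APP_LOG` marker are assumptions, not the author's actual implementation. Logging a single JSON string makes the entry easy for the log monitor to recognize and parse.

```javascript
// Shared "utils.js"-style log wrapper for a Lambda layer (illustrative).
function log({ user, fn, urgency = "info", sensitive = false, message, data }) {
  const entry = {
    type: "APP_LOG", // marker the log monitor looks for
    timestamp: new Date().toISOString(),
    user,
    fn,
    urgency,
    sensitive,
    message,
    data,
  };
  // One JSON string per entry; CloudWatch Logs stores it verbatim,
  // so the monitor can JSON.parse each matching line.
  console.log(JSON.stringify(entry));
  return entry;
}
```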
- CRON
Like the classic “cron” utility on Linux servers, this is a microservice that can be configured to run at intervals or at a specific date and/or time using CloudWatch Events. Typically this type of microservice would be used for cleanup functions, log parsers, auto-notifications, etc. More about setting this up here
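As a small example of a cleanup-style job, the selection logic below could run daily under a CloudWatch Events rule with a schedule expression such as `rate(1 day)` or `cron(0 3 * * ? *)`; the 30-day cutoff is an arbitrary example.

```javascript
// Select items older than maxAgeDays, e.g. for a scheduled cleanup run.
// "now" and item.createdAt are Unix timestamps in milliseconds.
function selectExpired(items, now, maxAgeDays = 30) {
  const cutoff = now - maxAgeDays * 24 * 60 * 60 * 1000;
  return items.filter((item) => item.createdAt < cutoff);
}
```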
- WebSocket manager
A microservice specifically for managing the connect, message and disconnect events of API Gateway WebSockets.
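API Gateway sets `event.requestContext.routeKey` to `$connect`, `$disconnect` or a custom route, so the handler is essentially a switch on that key. In this sketch, the `connections` store is a stand-in for something persistent like a DynamoDB table of connection IDs.

```javascript
// WebSocket manager: track connections and handle custom routes.
async function handler(event, connections) {
  const { routeKey, connectionId } = event.requestContext;
  switch (routeKey) {
    case "$connect":
      connections.set(connectionId, { connectedAt: Date.now() });
      return { statusCode: 200 };
    case "$disconnect":
      connections.delete(connectionId);
      return { statusCode: 200 };
    default:
      // A custom route, e.g. a message sent from the browser.
      return { statusCode: 200, body: `Received on ${routeKey}` };
  }
}
```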
Which type to use for a particular microservice comes down to the requirements: do you need a fast response to the user, do you need a high degree of resiliency, or is security the highest priority, for example? As mentioned, for the decoupled services, unless they are purely background processes, it is really important to be able to push information from your backend to your frontend. This means either push notifications for mobile apps or WebSockets for a web application.
Step Functions is also relevant here but deserves an article of its own. Step Functions enables you to design an asynchronous, fault-tolerant pipeline of Lambda functions. You can read more about this service here
Did I miss any microservice types that you believe are essential? - let me know!