With server-based solutions, the only way you can really get accurate per-user costs is to give each user their own server, which is usually impractical and results in a lot of waste. With serverless, you can log individual requests, compute time, storage and data transfer, calculate the associated costs and link them back to specific users using built-in AWS capabilities, some code and a few tricks.
Once you have collected this data, you can track and bill your users based on actual usage, which opens up new possibilities for business model innovation.
General Requirements
The following assumes your users are using API Keys or Cognito to access API Gateway, or an IAM key/secret to access services directly. Either way, you will need some searchable facility that contains all of your users and their IDs; these are the same IDs that appear in the logs for the different services.
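As an illustration of that lookup, here is a minimal sketch assuming a hypothetical DynamoDB table called "users" whose partition key is the same ID that shows up in the logs. The table and attribute names are my own placeholders, adapt them to your own setup.
const AWS = require('aws-sdk');
const ddb = new AWS.DynamoDB.DocumentClient();

// assumption: a "users" table keyed by the same ID that appears in the logs
// (Cognito sub, API key ID or IAM access key ID)
async function getUserByLogId(logId){
    let res = await ddb.get({
        TableName: 'users',
        Key: { userId: logId }
    }).promise();
    return res.Item; // undefined if the ID is unknown
}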
You will need one Lambda for parsing CloudWatch logs; this can run monthly or be triggered each time a log is added to CloudWatch. You will also need a Lambda for parsing S3 logs, which again can be triggered or scheduled.
Storage (S3)
S3 storage is a relatively simple one to track. This assumes that your users are uploading or creating files to be stored in the system. Most likely these files are:
- passing through a Lambda function - write a log to CloudWatch including the filename, the owner and the size of the file (a sketch of this follows the list)
- being uploaded via a signed URL - ensure S3 logging is enabled, then parse the logs with a Lambda
- or being added to the S3 bucket via the AWS Command Line Interface (CLI) - ensure S3 logging is enabled, then parse the logs with a Lambda
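For the first route, this is roughly the kind of log line the upload-handling Lambda could write. The field names are my own assumptions; the CloudWatch-parsing Lambda just needs to agree on them.
// a sketch of the "storage" log line an upload-handling Lambda could write;
// the field names here are assumptions, use whatever your parser expects
function logStorageUsage(userId, key, sizeBytes){
    console.log(JSON.stringify({
        type: 'storage',
        owner: userId,      // the same user ID that appears in your other logs
        filename: key,      // the S3 key being written
        size: sizeBytes     // object size in bytes
    }));
}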
The advantage of using S3 logs over manually writing a log from a Lambda is that S3 logs also include state changes. So if you have a lifecycle policy that moves files to deep storage or deletes files, you will be able to see when that happens in your S3 logs; there is currently no trigger available for this behaviour. The disadvantage of S3 logs is that delivery is not guaranteed and there is a varying delay between the action and the log being written. I have not come across issues so far, but if you need a guaranteed watertight solution, you could combine the logs with a scan of the actual files in S3 every couple of months to adjust for any discrepancies.
For tips on parsing S3 logs, see the end of the article.
Lambda Compute
There are two parts to tracking and costing Lambda compute time. Lambda logs a report at the end of each execution that contains the duration that will be billed for that execution. You need to retrieve this log, which can be done with a CloudWatch trigger to a Lambda function. The problem is that this report contains no information about WHO executed the function, so you need to make that link yourself. There are two ways of doing this:
- In the Lambda, the context gives you access to the request ID, which is also present in the report. Each time the function is called you could write the request ID to a database, then have a trigger Lambda add the duration from the report. However, this costs a bit of database traffic and may also delay your function execution.
- The approach that I take is to find the fastest way to identify the user in the Lambda, then write an "identification" log containing that identity. This can be the API key from the header or the Cognito user SUB from the authorization token, for example. Note that you should not write your users' personal information, such as email addresses, to the log. Once the identity information is there, the trigger Lambda can connect the ID with the report and store it with your costing information (a sketch follows below).
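As a rough sketch of that identification log, assuming a NodeJS handler behind API Gateway with a Cognito User Pool authorizer (the claims path below only applies to that setup, and the log field names are my own):
exports.handler = async (event, context) => {
    // identity from the Cognito authorizer claims (or read the API key header instead)
    let identity = event.requestContext.authorizer.claims.sub;
    // the trigger Lambda can join this log to the REPORT log via the request ID
    console.log(JSON.stringify({
        type: 'identification',
        requestid: context.awsRequestId,
        identity: identity
    }));
    // ... the rest of your handler
};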
Once you have the duration, you can multiply it by the AWS cost for Lambda compute in your region. Note that this differs depending on the amount of memory you have assigned to your function, and that the cost is per 100ms. Lambda request quantity is also a cost factor; once you have linked the report with a user ID, calculating the requests is simply a matter of counting the number of reports for that user.
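To make the arithmetic concrete, here is a sketch of the per-execution calculation. The rates below are placeholders, check the current Lambda pricing for your region and memory size before using them.
// placeholder rates, check current AWS pricing for your region
const PRICE_PER_GB_SECOND = 0.0000166667;
const PRICE_PER_REQUEST = 0.0000002;

function lambdaCost(billedDurationMs, memoryMb){
    // the billed duration already comes rounded up in the REPORT log
    let gbSeconds = (memoryMb / 1024) * (billedDurationMs / 1000);
    return gbSeconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST;
}

// e.g. 300ms billed at 512MB: (512/1024) * 0.3 = 0.15 GB-seconds,
// roughly 0.0000025 USD of compute plus the request charge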
Technical Tips
Getting the user SUB from a Cognito authorization header
// the claims are in the second (payload) section of the JWT
let sections = authorization.split('.');
let buffc = Buffer.from(sections[1], 'base64');
let claims = JSON.parse(buffc.toString('ascii'));
The user SUB can be found in claims.sub.
A NodeJS Lambda snippet for decoding the CloudWatch log from the trigger event:
const zlib = require('zlib');
// CloudWatch delivers the trigger payload base64-encoded and gzipped
let payload = Buffer.from(event.awslogs.data, 'base64');
let unzipped = zlib.unzipSync(payload).toString();
let logevents = JSON.parse(unzipped).logEvents;
And checking whether a given log event (logevents[l], inside a loop over logevents) is a report log:
let isreport = logevents[l].message.substr(0,6) === 'REPORT';
Split the log into parts
let parts = logevents[l].message.trim().split("\t");
The request ID (the first part looks like "REPORT RequestId: <id>", so skip the first 18 characters)
let requestid = parts[0].substring(18);
Creating the full log in a usable format
let rlog = {"report":true,"requestid":requestid};
for(let p in parts){
if(!parts.hasOwnProperty(p)) continue;
if(p == 0) continue;
// report key/values
let q = parts[p].trim().split(':');
if(typeof q[1] == 'undefined') continue;
// key, e.g. "Billed Duration" becomes "billed_duration"
let k = q[0].replace(/\s/g,'_').toLowerCase();
// numeric value: strip the trailing unit (ms or MB) and parse
rlog[k] = parseInt(q[1].substring(0,q[1].length-2).trim());
}
After that, you can store it on S3 as JSON or in DynamoDB.
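For example, a minimal sketch of the DynamoDB option; the table name and key schema are assumptions, not anything prescribed.
const AWS = require('aws-sdk');
const ddb = new AWS.DynamoDB.DocumentClient();

// assumption: a "usage_logs" table with requestid as its partition key
async function storeReport(rlog){
    await ddb.put({
        TableName: 'usage_logs',
        Item: rlog // requestid, billed_duration, memory_size, etc.
    }).promise();
}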
Data Transfer
Using S3 logs, you can track downloads of files via signed URLs and API key/secret requests for files using the AWS CLI. Enable S3 logs on your bucket, then parse them with a Lambda (see below for tips on this).
Tracking transfer of public files from CloudFront can be a bit trickier and is not something I have tried yet. It could be achieved with some client-side code, although that won't be watertight.
You can enable logging in API Gateway to track requests and all transfers that pass through it. These are stored as CloudWatch logs which can trigger, and be parsed by, a Lambda function. For authorized API calls, the logs include authorization information that you can use to link the cost back to a user; for non-authorized requests, you will need to include your own parameters in the request to achieve that. See below for tips on this.
Parsing S3 Logs
This had some pitfalls, so I want to share some tips to help with it. What follows is a mix of instructions and NodeJS Lambda snippets.
Columns that we can extract from the log
const columns = [
'Bucket_Owner', 'Bucket', 'Time', 'Remote_IP', 'Requester', 'Request_ID',
'Operation', 'Key', 'Request_URI', 'HTTP_status', 'Error_Code', 'Bytes_Sent',
'Object_Size', 'Total_Time', 'Turn_Around_Time', 'Referrer', 'User_Agent', 'Version_Id'
];
First, retrieve your S3 logs. You can read them all into memory, read them one by one or stream them, whichever suits your use case (mostly driven by the number of files you are typically processing in one sitting).
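As one way of doing that retrieval, here is a sketch that lists and reads the log objects with the AWS SDK; the bucket name and prefix are placeholders for wherever your S3 access logs are delivered.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// assumption: access logs are delivered to "my-log-bucket" under the "s3-access/" prefix
async function readLogFiles(){
    let files = [];
    let listing = await s3.listObjectsV2({
        Bucket: 'my-log-bucket',
        Prefix: 's3-access/'
    }).promise();
    // note: handle listing.NextContinuationToken if you have more than 1000 log objects
    for(let obj of listing.Contents){
        let data = await s3.getObject({ Bucket: 'my-log-bucket', Key: obj.Key }).promise();
        files.push(data.Body.toString('utf8')); // one log file's raw text
    }
    return files;
}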
Whichever method you prefer, you will need to read in a single log’s data then parse it:
// month names as they appear in the S3 log timestamps
const months = ['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'];
// split and loop through rows
let rows = s3_log_data.split("\n");
for(let r in rows){
if(!rows.hasOwnProperty(r)) continue;
// parse the row (function is below)
let cols = getDataFromCSVLine(rows[r]);
// skip if invalid
if(cols.length < 10) continue;
// skip if not a relevant action; this will depend on your use case, I only needed GET requests
if(cols[7] !== 'REST.GET.OBJECT') continue;
// fix date field, it’s annoyingly spread over 2 columns, merge into the first, delete the 2nd
cols[2] = cols[2].substring(1)+' '+cols[3].substring(0, cols[3].length-1);
cols.splice(3,1);
// create a timestamp, because that's more useful to work with
let dp = cols[2].split(' ')[0]; // 0: date+time, 1: offset
dp = dp.split(':'); // 0: date, 1:hour, 2: minute, 3:second
let dd = dp[0].split('/'); // day, month, year
cols[2] = new Date(dd[2],months.indexOf(dd[1]),dd[0],dp[1],dp[2],dp[3]).getTime();
// create nice key-value object log
let newLog = {};
for(let c in cols){
if(!cols.hasOwnProperty(c)) continue;
// the last couple of columns don't seem to fit into the column names, but I didn't need them so I just made this quick solution
if(typeof columns[c] === 'undefined'){
columns[c] = 'column '+c;
}
newLog[columns[c]] = cols[c];
}
// you can then add "newLog" to a collector object or write it straight to your db or JSON file
}
This is a simple parsing function that splits an S3 log row into its space-delimited fields (quoted fields are kept intact)
function getDataFromCSVLine(line) {
let dataArray = [];
let tempString="";
let lineLength = line.length;
let index=0;
while(index < lineLength) {
if(line[index]=='"') {
let index2 = index+1;
while(index2 < lineLength && line[index2]!='"') {
tempString+=line[index2];
index2++;
}
dataArray.push(tempString);
tempString = "";
index = index2+2;
continue;
}
if(line[index]!=" ") {
tempString += line[index];
index++; continue;
}
if(line[index]==" ") {
dataArray.push(tempString);
tempString = "";
index++;continue;
}
}
dataArray.push(tempString);
return dataArray;
}
Parsing API Gateway Logs
I recently implemented this and it had some pitfalls, so here are some tips for processing these logs.
To set API Gateway up for logging, create a role that API Gateway can use to write logs to CloudWatch (you can just pick the ready-made options for this when adding a new role). Then enable CloudWatch logs in your stage configuration: the log level should be "info", and check "Log full requests/responses data".
I then created a Lambda microservice (NodeJS) that retrieves the CloudWatch logs and processes them. This could also be a trigger, but I found a daily batch approach better suited to my use case. API Gateway creates a whole bunch of streams with hash names to write its logs to; unlike Lambda logs, these don't use time-based naming, and it also seems a bit random which streams get written to. There are a few steps to go through to parse the API Gateway logs into something useful (a code sketch of the stream and event retrieval follows the list).
- Get all of the streams for the API Gateway log group using describeLogStreams. Make sure to set the orderBy param to 'LastEventTime' and the descending param to true, so that the response lists the stream with the most recent log first, then progressively older ones, down to streams that have no logs at all. This may need to be a looping CloudWatch request if a nextToken is returned.
- Check each returned stream to build an "active stream" array. If data.logStreams[s].lastEventTimestamp is undefined, that stream has no logs. Note that because of the ordering we are using, once we reach a stream with no logs we can stop looping through the streams, as nothing after it will have any logs either.
- Loop through the active streams using getLogEvents to retrieve all of the logs in each stream. This may also need to be a looping request; note that getLogEvents always returns a nextForwardToken, so stop when it matches the token you passed in rather than waiting for it to disappear.
- Parse each log message and pull out the useful information (some code notes below).
- Store the logs in daily segments so they can be fed into a log aggregator later on
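Here is a sketch of the stream and event retrieval described above, using the NodeJS SDK; the log group name is a placeholder for whatever API Gateway created for your API and stage.
const AWS = require('aws-sdk');
const cwl = new AWS.CloudWatchLogs();

// assumption: the execution log group name for your API and stage
const LOG_GROUP = 'API-Gateway-Execution-Logs_abc123/prod';

async function getActiveStreams(){
    let streams = [];
    let nextToken;
    do{
        let res = await cwl.describeLogStreams({
            logGroupName: LOG_GROUP,
            orderBy: 'LastEventTime',
            descending: true,
            nextToken: nextToken
        }).promise();
        for(let s of res.logStreams){
            // streams without a lastEventTimestamp have never been written to;
            // because of the ordering we can stop at the first one we see
            if(typeof s.lastEventTimestamp === 'undefined') return streams;
            streams.push(s.logStreamName);
        }
        nextToken = res.nextToken;
    }while(nextToken);
    return streams;
}

async function getStreamEvents(streamName){
    let events = [];
    let token;
    do{
        let res = await cwl.getLogEvents({
            logGroupName: LOG_GROUP,
            logStreamName: streamName,
            startFromHead: true,
            nextToken: token
        }).promise();
        events = events.concat(res.events);
        // getLogEvents returns the same token again when there is nothing left
        if(res.nextForwardToken === token) break;
        token = res.nextForwardToken;
    }while(true);
    return events;
}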
Parsing the log messages themselves can be a bit tricky. Each log event consists of a timestamp and a message param. The timestamp is valid and can be used as-is; how to parse the message will depend on what type of message it is.
Getting the log ID from the message
let logIDMatch = /\(([^)]+)\)/.exec(message);
// exec returns null when there is no match, so check before reading the capture group
let logID = logIDMatch ? logIDMatch[1] : null;
Getting the HTTP method and path
let search = 'HTTP Method: ';
let searchIndex = message.search(search);
message = message.substring(searchIndex + search.length).trim();
let parts = message.split(',');
let method = parts[0];
let path = parts[1].substring(15);
Most other data will likely be JSON or JSON-like. Some fixes are needed to be able to read most messages, but all of this will need to be tweaked for your configuration (user input/JSON schema/API Gateway transforms/etc.)
// getting a Cognito sub so you can link this cost to a user
let search = 'Cognito User Pool Authorizer claims from JWT: ';
// getting the API key; this is censored in the actual request log, but if you transform the input for your Lambda function you can get it from there
search = 'Endpoint request body after transformations: ';
// getting the response size, i.e. the data transferred to the user that has a cost attached to it
search = 'Endpoint response headers: ';
let searchIndex = message.search(search);
message = message.substring(searchIndex + search.length).trim();
// fix truncated JSON (the details of this depend on how you have configured the JSON input coming in from the user)
// note: use indexOf here, String.search() would treat '[TRUNCATED]' as a regex character class
if(message.indexOf('[TRUNCATED]') !== -1){
    message = message.replace('[TRUNCATED]','"}}');
}
// result
let result = {};
// try this, it will work for some messages
try{
result = JSON.parse(message.replace(/\\n/g,'')); // strip escaped newlines (globally) before parsing
// else we need to clean some more
}catch(e){
// disclaimer: there is probably a really elegant regex for this but I was working on a rapid prototype so took the quick and easy option!
// cut brackets from string
message = message.substring(1,message.length-1);
// add quotes and the brackets back on
message = '{"'+message.replace(/=/g,'":"').replace(/, /g,'", "')+'"}';
// fix dates
message = message.replace('"Date":"Mon", "','"Date":"Mon, ')
.replace('"Date":"Tue", "','"Date":"Tue, ')
.replace('"Date":"Wed", "','"Date":"Wed, ')
.replace('"Date":"Thu", "','"Date":"Thu, ')
.replace('"Date":"Fri", "','"Date":"Fri, ')
.replace('"Date":"Sat", "','"Date":"Sat, ')
.replace('"Date":"Sun", "','"Date":"Sun, ');
// fix X-Amzn-Trace-Id=root=
message = message.replace('root":"','root:').replace(';sampled":"',';sampled:');
// try again
try{
result = JSON.parse(message);
}catch(e){
// debug, figure out the issue, add more fixes, repeat until no more error
console.log('COULD NOT PARSE: ',message);
return false;
}
}