Online fraud has a widespread impact on businesses and requires an effective end-to-end strategy to detect and prevent new account fraud and account takeovers, and to stop suspicious payment transactions. Detecting fraud close to the time it occurs is key to the success of a fraud detection and prevention system. The system should detect fraud as effectively as possible and alert the end-user as quickly as possible. The user can then choose to take action to prevent further abuse.
In this post, we show a serverless approach to detect online transaction fraud in near-real time. We show how you can apply this approach to various data streaming and event-driven architectures, depending on the desired outcome and the actions to take to prevent fraud (such as alerting the user about the fraud or flagging the transaction for additional review).
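This post implements three architectures: streaming data inspection with fraud detection and prevention, using a Kinesis data stream, AWS Lambda, and AWS Step Functions; streaming data enrichment, using Amazon Kinesis Data Firehose; and event data inspection with fraud detection and prevention, using Amazon EventBridge and Step Functions.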
To detect fraudulent transactions, we use Amazon Fraud Detector, a fully managed service enabling you to identify potentially fraudulent activities and catch more online fraud faster. To build an Amazon Fraud Detector model based on past data, refer to Detect online transaction fraud with new Amazon Fraud Detector features. You can also use Amazon SageMaker to train a proprietary fraud detection model. For more information, refer to Train fraudulent payment detection with Amazon SageMaker.
This architecture uses Lambda and Step Functions to inspect data from a Kinesis data stream in real time and to detect and prevent fraud using Amazon Fraud Detector. The same architecture applies if you use Amazon Managed Streaming for Apache Kafka (Amazon MSK) as a data streaming service. This pattern can be useful for real-time fraud detection, notification, and potential prevention. Example use cases include payment processing and high-volume account creation. The following diagram illustrates the solution architecture.
The flow of the process in this implementation is as follows:
This approach allows you to react to potentially fraudulent transactions in real time, because you store each transaction in a database and inspect it before processing it further. In an actual implementation, you may replace the notification step for additional review with an action that is specific to your business process. For example, you could inspect the transaction using another fraud detection model or conduct a manual review.
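The inspection step described above can be sketched in Python. This is a minimal illustration, not the code from the sample repository: the detector ID, event type, entity type, transaction field names, and the outcome labels (`block`, `investigate`) are all assumptions. The `client` argument is expected to behave like `boto3.client("frauddetector")`, and it is passed in so the logic can be exercised without AWS access.

```python
import base64
import json

# Hypothetical names: replace with your own detector, event type, and entity type.
DETECTOR_ID = "transaction_fraud_detector"
EVENT_TYPE = "transaction_event"
ENTITY_TYPE = "customer"


def decode_kinesis_record(record):
    """Decode the base64-encoded JSON payload of one Kinesis record."""
    return json.loads(base64.b64decode(record["kinesis"]["data"]))


def get_outcomes(transaction, client):
    """Request a prediction and collect the outcomes of all matched rules.

    `client` should behave like boto3.client("frauddetector").
    The transaction field names used here are assumptions.
    """
    response = client.get_event_prediction(
        detectorId=DETECTOR_ID,
        eventId=transaction["transaction_id"],
        eventTypeName=EVENT_TYPE,
        eventTimestamp=transaction["timestamp"],
        entities=[
            {"entityType": ENTITY_TYPE, "entityId": transaction["customer_id"]}
        ],
        # Amazon Fraud Detector expects every event variable as a string.
        eventVariables={k: str(v) for k, v in transaction["variables"].items()},
    )
    outcomes = []
    for rule in response.get("ruleResults", []):
        outcomes.extend(rule.get("outcomes", []))
    return outcomes


def decide_action(outcomes):
    """Map outcomes to a next step; the outcome labels are assumptions."""
    if "block" in outcomes:
        return "BLOCK"
    if "investigate" in outcomes:
        return "REVIEW"
    return "APPROVE"
```

In a Step Functions workflow, the value returned by `decide_action` would correspond to the branch taken by a choice state, so replacing the review step with a different action only changes that branch.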
Sometimes, you may need to flag potentially fraudulent data but still process it; for example, when you’re storing the transactions for further analytics and collecting more data for constantly tuning the fraud detection model. An example use case is claims processing. During claims processing, you collect all the claims documents and then run them through a fraud detection system. A decision to process or reject a claim is then made—not necessarily in real time. In such cases, streaming data enrichment may fit your use case better.
This architecture uses Lambda, invoked through Kinesis Data Firehose data transformation, to enrich streaming data with Amazon Fraud Detector results in real time.
This approach doesn’t implement fraud prevention steps. We deliver enriched data to an Amazon Simple Storage Service (Amazon S3) bucket. Downstream services that consume the data can use the fraud detection results in their business logic and act accordingly. The following diagram illustrates this architecture.
The flow of the process in this implementation is as follows:
As a result, we have data in the S3 bucket that includes not only original data but also the Amazon Fraud Detector response as metadata for each of the transactions. You can use this metadata in your data analytics solutions, machine learning model training tasks, or visualizations and dashboards that consume transaction data.
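A data transformation Lambda function for this enrichment pattern could look like the following sketch. It follows the Kinesis Data Firehose transformation contract (each returned record carries the original `recordId`, a `result`, and base64-encoded `data`). The `score_transaction` stub and the `fraud_detection` key are hypothetical; in practice the stub would call Amazon Fraud Detector.

```python
import base64
import json


def score_transaction(transaction):
    """Hypothetical stub standing in for the Amazon Fraud Detector call."""
    return {"outcomes": ["approve"]}


def transform_records(event, score=score_transaction):
    """Add a fraud detection result to every record in a Firehose batch."""
    output = []
    for record in event["records"]:
        try:
            transaction = json.loads(base64.b64decode(record["data"]))
            # "fraud_detection" is an assumed key name for the added metadata.
            transaction["fraud_detection"] = score(transaction)
            # A trailing newline keeps delivered objects line-delimited.
            payload = json.dumps(transaction) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(payload.encode("utf-8")).decode("utf-8"),
            })
        except (ValueError, TypeError):
            # Payload is not a JSON object: return it unchanged and mark it failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}


def lambda_handler(event, context):
    """Lambda entry point invoked by Kinesis Data Firehose."""
    return transform_records(event)
```

Passing the scoring function as a parameter is a design choice for this sketch: it lets you test the record handling locally and swap in a different model later without touching the transformation logic.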
Not all data comes into your system as a stream. However, in event-driven architectures, you can still follow a similar approach.
This architecture uses Step Functions to inspect EventBridge events in real time and to detect and prevent fraud using Amazon Fraud Detector. It doesn’t stop processing of a potentially fraudulent transaction; instead, it flags the transaction for additional review. We publish enriched transactions to an event bus that differs from the one that the raw event data is published to. This way, consumers of the data can be sure that all events include fraud detection results as metadata. The consumers can then inspect the metadata and apply their own rules based on it. For example, in an event-driven ecommerce application, a consumer can choose not to process an order if the transaction is predicted to be fraudulent. This architecture pattern can also be useful for detecting and preventing fraud in new account creation or during account profile changes (like changing your address, phone number, or credit card on file in your account profile). The following diagram illustrates the solution architecture.
The flow of the process in this implementation is as follows:
As in the Kinesis Data Firehose data enrichment method, this architecture doesn’t prevent fraudulent data from reaching the next step. It adds fraud detection metadata to the original event and sends notifications about potentially fraudulent transactions. Consumers of the enriched data might not include business logic that uses fraud detection metadata in their decisions. In that case, you can change the Step Functions workflow so that it doesn’t put such transactions on the destination bus and instead routes them to a separate event bus, to be consumed by a separate application that processes suspicious transactions.
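The enrichment and routing logic described above can be expressed as a small Python function that builds an entry for the EventBridge `PutEvents` API. This is a sketch of the idea only; the post implements it in a Step Functions workflow. The bus names, the outcome labels, and the `fraud_detection` key are assumptions.

```python
import json

# Hypothetical bus names and outcome labels.
ENRICHED_BUS = "enriched-transactions"
SUSPICIOUS_BUS = "suspicious-transactions"
SUSPICIOUS_OUTCOMES = {"block", "investigate"}


def build_enriched_entry(event, outcomes):
    """Build a PutEvents entry that carries fraud detection metadata.

    Suspicious transactions are routed to a separate bus, so consumers of
    the enriched bus never receive them.
    """
    # Copy the detail so the incoming event is not modified.
    detail = dict(event["detail"])
    detail["fraud_detection"] = {"outcomes": list(outcomes)}
    suspicious = bool(SUSPICIOUS_OUTCOMES.intersection(outcomes))
    return {
        "Source": event["source"],
        "DetailType": event["detail-type"],
        "Detail": json.dumps(detail),
        "EventBusName": SUSPICIOUS_BUS if suspicious else ENRICHED_BUS,
    }
```

The returned dictionary can be passed to `put_events(Entries=[entry])` on a boto3 `events` client. Keeping the routing decision in one place means that changing where suspicious transactions go only requires changing the bus name.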
For each of the architectures described in this post, you can find AWS Serverless Application Model (AWS SAM) templates, deployment, and testing instructions in the sample repository.
This post walked through different methods to implement a real-time fraud detection and prevention solution using Amazon Machine Learning services and serverless architectures. These solutions allow you to detect fraud closer to the time of fraud occurrence and act on it as quickly as possible. The flexibility of the implementation using Step Functions allows you to react in a way that is most appropriate for the situation and also adjust prevention steps with minimal code changes.
For more serverless learning resources, visit Serverless Land.
Veda Raman is a Senior Specialist Solutions Architect for machine learning based in Maryland. Veda works with customers to help them architect efficient, secure, and scalable machine learning applications. Veda is interested in helping customers leverage serverless technologies for machine learning.
Giedrius Praspaliauskas is a Senior Specialist Solutions Architect for serverless based in California. Giedrius works with customers to help them leverage serverless services to build scalable, fault-tolerant, high-performing, cost-effective applications.