Overview

While I was writing How to Unit Test Kafka Streams Applications, another idea came to mind. I wondered how to turn those unit testing approaches into a case study closer to our daily work. How about using Behavior Driven Development (BDD) instead of conventional unit testing? What would be a proper user story for this kind of use case? The idea kept rolling until I decided to write a case study for this article.

The outline is pretty straightforward. First, I will create a user story as my requirement. Then I will translate the user story into a Gherkin feature file. Because I chose the TDD approach, I will map the Gherkin feature file into test files, implemented with Cucumber and the JUnit runner. The last step will be to develop the fraud detection itself using Kafka Streams.

User Story

“User stories are part of an agile approach that helps shift the focus from writing about requirements to talking about them. All agile user stories include a written sentence or two and, more importantly, a series of conversations about the desired functionality” – Mike Cohn

To start with, we need a good user story.

What constitutes a good user story? A user story itself is a requirement. And a good requirement should be atomic and consist of:

Actor: who will use the requirement.
Rationale: why we need the requirement, which also helps us judge, to some degree, whether it is worthwhile.
Acceptance Criteria: how we measure the requirement and make it testable.

Though all three are important, as software engineers we need to focus on the last one. There are two reasons to pay more attention to the acceptance criteria. First, in my experience, it is the most underestimated part of the user story. Product owners usually focus only on the actor and the rationale, and sometimes do not know how to measure their requirements. Second, you need measurable acceptance criteria to create a test scenario. Otherwise, how can we test whether the requirement has been fulfilled? The bottom line is: if you cannot measure the requirement, it is not a requirement.

I drew a lot of inspiration about what makes a good requirement from the book Mastering the Requirements Process by Suzanne Robertson and James Robertson. If you have a chance to read it, please do.

Revisit User Story

Now you may be wondering why I talk so much about user stories and requirements. There is a reason: to map a user story into Gherkin format, we need a good user story. Now let us revisit our user story with some highlights.

As you can see, I have pointed out the actor, the rationale, and the acceptance criteria, with measurements that make them testable. With all the measurements in place, we can now move on to creating a Gherkin feature file for this story.

Of course, non-functional requirements still need to be translated into stories as well. I usually start by creating an epic for a specific requirement, then create multiple stories covering functional and non-functional requirements, plus some constraints if necessary. But that is beyond the scope of this article.

Gherkin

Because there are three acceptance criteria, we will create three Scenario Outlines. This one-to-one mapping is not compulsory, though: you can merge them into one scenario outline or split them into more than three. It depends on your approach to writing the Gherkin file.

If you notice, there are at least four variables that are needed to create the scenario:

The customer, represented in this case by a credit card number.
Transaction amount.
Transaction date, which we call the event date in this case.
Suspicious amount.

Single Transaction

Let us start with the simplest one. Those four variables are enough for this scenario.

	Feature: Fraud Detection

	@single
	Scenario Outline: Single transaction scenario: amount above $1000 then notify
	Given Customer has a credit card with account number "<cc>"
	When Customer transacts $<amount> at "<event>"
	Then Fraud flag is "<flag>"
	And total suspicious amount is $<total>
	Examples:
	|cc|amount|event|flag|total|
	|4567-8901-2345-6789|500.50|2020-12-10T13:50:40Z||0|
	|4567-8901-2345-6789|1000|2020-12-10T13:50:40Z||0|
	|4567-8901-2345-6789|1000.50|2020-12-10T13:50:40Z|Y|1000.50|
	|4567-8901-2345-6789|1500|2020-12-10T13:50:40Z|Y|1500|

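To see the arithmetic behind the Examples rows of the single transaction scenario, here is a minimal plain-Java sketch. It is not the implementation from this series (that one uses Kafka Streams); the class and method names are hypothetical, and the only rule encoded is the one in the scenario title: an amount strictly above $1000 is flagged.

```java
import java.math.BigDecimal;

// Hypothetical helper, not the article's implementation: it only mirrors
// the arithmetic behind the Examples table of the @single scenario.
final class SingleTransactionCheck {
    // Threshold taken from the scenario title: "amount above $1000 then notify".
    static final BigDecimal THRESHOLD = new BigDecimal("1000");

    // Returns "Y" when the amount is strictly above the threshold, otherwise "".
    static String flag(BigDecimal amount) {
        return amount.compareTo(THRESHOLD) > 0 ? "Y" : "";
    }

    // Returns the suspicious total: the amount itself when flagged, otherwise zero.
    static BigDecimal suspiciousTotal(BigDecimal amount) {
        return amount.compareTo(THRESHOLD) > 0 ? amount : BigDecimal.ZERO;
    }
}
```

Using BigDecimal with compareTo keeps the boundary exact: an amount of exactly 1000 is not flagged, while 1000.50 is, matching the second and third rows of the table.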

5-minute Interval

Now it is getting tricky. We need to simulate multiple transactions, so I will add more variables to cover them.

	@hopping
	Scenario Outline: Accumulative transaction during 5 minutes interval: amount above $1500 then notify
	Given Customer has a credit card with account number "<cc>"
	When Customer transacts $<amount1> at "<event1>"
	And Customer transacts $<amount2> at "<event2>"
	And Customer transacts $<amount3> at "<event3>"
	Then Fraud flag is "<flag>"
	And total suspicious amount is $<total>
	Examples:
	|cc|amount1|amount2|amount3|event1|event2|event3|flag|total|
	|4567-8901-2345-6789|100.50|500.50|500.0|2020-12-10T13:00:00Z|2020-12-10T13:01:00Z|2020-12-10T13:05:00Z||0|
	|4567-8901-2345-6789|500.50|500.50|500.0|2020-12-10T13:00:00Z|2020-12-10T13:02:00Z|2020-12-10T13:04:00Z|Y|1501.0|
	|4567-8901-2345-6789|500.50|500.50|500.0|2020-12-10T13:00:00Z|2020-12-10T13:01:00Z|2020-12-10T13:06:00Z||0|

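The rows of the 5-minute scenario can be checked by hand with a small plain-Java sketch. It rests on some assumptions. It is not the Kafka Streams implementation from this series. It anchors a window at each event time rather than using real hopping windows. It treats a window as the half-open interval from start (inclusive) to start plus 5 minutes (exclusive). All names are hypothetical.

```java
import java.math.BigDecimal;
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical helper: mirrors the arithmetic of the @hopping Examples rows only.
final class FiveMinuteWindowCheck {
    // Threshold taken from the scenario title: "amount above $1500 then notify".
    static final BigDecimal THRESHOLD = new BigDecimal("1500");
    static final Duration WINDOW = Duration.ofMinutes(5);

    // Tries each event time as a window start and returns the largest sum of
    // amounts whose event time falls in [start, start + 5 min).
    // amounts.get(i) is the amount of the transaction at events.get(i).
    static BigDecimal largestWindowTotal(List<Instant> events, List<BigDecimal> amounts) {
        BigDecimal best = BigDecimal.ZERO;
        for (Instant start : events) {
            Instant end = start.plus(WINDOW);
            BigDecimal sum = BigDecimal.ZERO;
            for (int i = 0; i < events.size(); i++) {
                Instant t = events.get(i);
                // Assumption: start is inclusive, end is exclusive.
                if (!t.isBefore(start) && t.isBefore(end)) {
                    sum = sum.add(amounts.get(i));
                }
            }
            if (sum.compareTo(best) > 0) {
                best = sum;
            }
        }
        return best;
    }

    // Returns the window total when it is strictly above the threshold, otherwise zero.
    static BigDecimal suspiciousTotal(List<Instant> events, List<BigDecimal> amounts) {
        BigDecimal total = largestWindowTotal(events, amounts);
        return total.compareTo(THRESHOLD) > 0 ? total : BigDecimal.ZERO;
    }
}
```

For the second row, all three transactions fall within five minutes of 13:00, so the total is 1501.0 and the flag is raised. For the third row, the 13:06 transaction falls outside the window that starts at 13:00, so no window exceeds $1500.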

1-hour Inactivity Gap

Similar to the previous scenario, I need to add even more variables to simulate more transactions.

	@session
	Scenario Outline: Accumulative transaction in a single period of activity with 1 hour inactivity gap: amount above $4000 then notify
	Given Customer has a credit card with account number "<cc>"
	When Customer transacts $<amount1> at "<event1>"
	And Customer transacts $<amount2> at "<event2>"
	And Customer transacts $<amount3> at "<event3>"
	And Customer transacts $<amount4> at "<event4>"
	And Customer transacts $<amount5> at "<event5>"
	Then Fraud flag is "<flag>"
	And total suspicious amount is $<total>
	Examples:
	|cc|amount1|amount2|amount3|amount4|amount5|event1|event2|event3|event4|event5|flag|total|
	|4567-8901-2345-6789|1000|1000|1000|1000|0|2020-12-10T13:00:00Z|2020-12-10T13:10:00Z|2020-12-10T13:30:00Z|2020-12-10T13:40:00Z|2020-12-10T13:50:00Z||0|
	|4567-8901-2345-6789|1000|1000|1000|1000|1000|2020-12-10T13:00:00Z|2020-12-10T13:30:00Z|2020-12-10T13:30:00Z|2020-12-10T13:40:00Z|2020-12-10T13:50:00Z|Y|5000|
	|4567-8901-2345-6789|1000|1000|1000|1000|1000|2020-12-10T13:00:00Z|2020-12-10T13:30:00Z|2020-12-10T13:30:00Z|2020-12-10T13:40:00Z|2020-12-10T14:30:00Z|Y|5000|
	|4567-8901-2345-6789|1000|1000|1000|1000|1000|2020-12-10T13:00:00Z|2020-12-10T14:01:00Z|2020-12-10T14:30:00Z|2020-12-10T14:40:00Z|2020-12-10T14:50:00Z||0|
	# this sets the flag to Y because amount2 is $1500. It breaks the first scenario, so we still need to report the suspicious transaction.
	|4567-8901-2345-6789|1000|1500|1000|1000|0|2020-12-10T13:00:00Z|2020-12-10T14:01:00Z|2020-12-10T14:30:00Z|2020-12-10T14:40:00Z|2020-12-10T14:50:00Z|Y|1500|
	# this sets the flag to Y because event2 and event3 break the hopping window scenario
	|4567-8901-2345-6789|1000|1000|1000|1000|0|2020-12-10T13:00:00Z|2020-12-10T13:10:00Z|2020-12-10T13:11:00Z|2020-12-10T13:40:00Z|2020-12-10T13:50:00Z|Y|2000|

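The session rule itself can be sketched in plain Java to verify the first four rows by hand. As before, this is an illustration under stated assumptions, not the Kafka Streams implementation from this series. The names are hypothetical. The events are assumed to arrive sorted by time. A gap strictly longer than one hour is assumed to start a new session. The last two rows are flagged by the single transaction and 5-minute rules, which this sketch deliberately does not cover.

```java
import java.math.BigDecimal;
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical helper: mirrors only the session rule of the @session scenario.
final class SessionGapCheck {
    // Threshold taken from the scenario title: "amount above $4000 then notify".
    static final BigDecimal THRESHOLD = new BigDecimal("4000");
    static final Duration INACTIVITY_GAP = Duration.ofHours(1);

    // Walks the events in order (assumed sorted by time) and returns the
    // largest running total reached within a single session.
    // amounts.get(i) is the amount of the transaction at events.get(i).
    static BigDecimal largestSessionTotal(List<Instant> events, List<BigDecimal> amounts) {
        BigDecimal best = BigDecimal.ZERO;
        BigDecimal current = BigDecimal.ZERO;
        Instant previous = null;
        for (int i = 0; i < events.size(); i++) {
            Instant t = events.get(i);
            // Assumption: a gap strictly longer than one hour closes the session.
            if (previous != null
                    && Duration.between(previous, t).compareTo(INACTIVITY_GAP) > 0) {
                current = BigDecimal.ZERO;
            }
            current = current.add(amounts.get(i));
            if (current.compareTo(best) > 0) {
                best = current;
            }
            previous = t;
        }
        return best;
    }

    // Returns the session total when it is strictly above the threshold, otherwise zero.
    static BigDecimal suspiciousTotal(List<Instant> events, List<BigDecimal> amounts) {
        BigDecimal total = largestSessionTotal(events, amounts);
        return total.compareTo(THRESHOLD) > 0 ? total : BigDecimal.ZERO;
    }
}
```

In the fourth row, the 61-minute gap between 13:00 and 14:01 splits the transactions into a $1000 session and a $4000 session. Neither is above $4000, so the flag stays empty even though the five amounts add up to $5000.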

Note: To be honest, I would be happier if the Examples part were more dynamic, allowing multiple values for a single variable. Maybe you can help me solve this issue?
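One possible workaround, offered as a suggestion rather than anything taken from the feature files above: keep a single amounts column, put several values in one cell separated by an agreed character, and split the cell inside the step definition. The sketch below shows only the splitting part in plain Java. The class name, the method name, and the ';' separator are all hypothetical conventions, not something Gherkin or Cucumber provides.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for step definitions: not part of Gherkin or Cucumber.
final class MultiValueCell {
    // Splits a cell such as "500.50;500.50;500.0" into a list of amounts.
    // The ';' separator is an assumed convention for this sketch.
    static List<BigDecimal> amounts(String cell) {
        List<BigDecimal> result = new ArrayList<>();
        if (cell == null || cell.trim().isEmpty()) {
            return result; // an empty cell means no transactions
        }
        for (String part : cell.split(";")) {
            result.add(new BigDecimal(part.trim()));
        }
        return result;
    }
}
```

The same idea would apply to the event dates. The trade-off is that the feature file becomes a little less readable, because the structure of a cell is now a convention that only the step definition knows about.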

To Be Continued

Maybe you have noticed that I have not mentioned any system, architecture, or integration so far. Where should the system be deployed? Is it machine learning or just stream technology? Or is it just a monolithic back-end application? Yes, you are right: I did it on purpose. Why? Because a requirement does not prescribe solutions.

But I know you want the solution and the implementation, because in the end that is what matters. And it is you, the development team, who provide the solution. Am I right? Y.E.S I.A.M. So let’s go to part 2.

CLICK HERE TO PART 2.

Author: ru rocker

I have been a professional software developer since 2004. Java, Python, NodeJS, and Go-lang are my favorite programming languages. I also have an interest in DevOps. I hold professional certifications: SCJP, SCWCD, PSM 1, AWS Solution Architect Associate, and AWS Solution Architect Professional.

A Study Case: Building A Simple Credit Card Fraud Detection System – Part 1: From User Story to Gherkin Feature File