Test Data Management

Software Testing

Sastri Munsamy

19 Aug 2019

Business Problem:

Testing timelines impacted by lack of test data
Requests for test data take too long
Test data is inaccurate and compromises the results of the tests
Automated tests take too long to execute because of prerequisite data dependencies and the manual capture of data into each test’s datasheet. As a result, the automated regression pack cannot be executed daily, which lengthens time to market

Solution:

Test Data Management Model (TDM)

What is TDM?

TDM is an Inspired Testing service offering designed to assist organisations with defining a strategy for obtaining test data for their day-to-day testing activities. In some instances, the test data is stored in a separate repository for instant retrieval, commonly referred to as a Test Data Warehouse, effectively providing “test data on tap”. End users can request data from an interface and instantly receive the test data they need.

Here are some examples of what can be included in your TDM Strategy:

Processes to create or obtain data are automated
- Front End processes
- API processes to facilitate integration of data between systems
- Database insertions, updates and queries
- Using a commercial TDM tool such as IBM InfoSphere
Data recycling processes are automated, i.e. the test data is put back into its original state so that it can be consumed again
Snapshot and restoring of the databases
Automating the aging of test data for specific business processes
State transition analysis of the test data through the various processes that consume it
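
As a minimal sketch of the snapshot-and-restore idea, the Python below uses the standard library's sqlite3 module with an in-memory database. The table, its columns and the helper names (take_snapshot, restore_snapshot) are assumptions made for illustration only; a real test environment would rely on its own database's backup tooling.

```python
import sqlite3


def take_snapshot(source: sqlite3.Connection) -> sqlite3.Connection:
    """Copy the whole database into a new in-memory snapshot."""
    snapshot = sqlite3.connect(":memory:")
    source.backup(snapshot)
    return snapshot


def restore_snapshot(snapshot: sqlite3.Connection, target: sqlite3.Connection) -> None:
    """Overwrite the target database with the snapshot's contents."""
    target.commit()  # make sure the target has no open transaction
    snapshot.backup(target)


# Hypothetical test database holding a single customer record.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, status TEXT)")
db.execute("INSERT INTO customer (id, status) VALUES (1, 'new')")
db.commit()

baseline = take_snapshot(db)

# A test consumes the record and changes its state.
db.execute("UPDATE customer SET status = 'used' WHERE id = 1")
db.commit()

# Recycle the data: put it back into its original state.
restore_snapshot(baseline, db)
rows = db.execute("SELECT status FROM customer WHERE id = 1").fetchall()
status_after_restore = rows[0][0]
```

After the restore, the record can be consumed again by the next test run without anyone recreating it by hand.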

Sometimes no process can be automated to create test data, and you are forced to use the limited test data provided by the respective system teams. In such instances, we analyse how the state of the test data changes as tests consume it, and use that to identify the most efficient test sequence, so that a minimal amount of test data yields the maximum test coverage. The following example illustrates the technique:
Test 1 – Requires test data in State A
Test 3 – Requires test data in State B

If Test 3 executes successfully, it transitions the test data from State B to State A. You must therefore sequence your tests so that Test 3 runs first, followed by Test 1: Test 3 leaves the data in exactly the state that Test 1 needs.
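
The sequencing above can be sketched in Python as a small backtracking search. This is an illustration under stated assumptions, not a prescribed implementation: the PlannedTest type and the sequence_tests function are hypothetical names, and the state that Test 1 leaves the data in is assumed to be State A, since the example does not say.

```python
from typing import NamedTuple, Optional


class PlannedTest(NamedTuple):
    name: str
    requires: str  # state the test data must be in before the test runs
    leaves: str    # state the test data is in after a successful run


def sequence_tests(tests: list, initial_state: str) -> Optional[list]:
    """Return an order in which all tests can share one data record, or None."""

    def search(state, remaining, order):
        if not remaining:
            return order
        for i, test in enumerate(remaining):
            if test.requires == state:
                rest = remaining[:i] + remaining[i + 1:]
                found = search(test.leaves, rest, order + [test.name])
                if found is not None:
                    return found
        return None  # no remaining test can use the data in this state

    return search(initial_state, list(tests), [])


tests = [
    # Assumption: Test 1 leaves the data in State A (not stated in the example).
    PlannedTest("Test 1", requires="A", leaves="A"),
    PlannedTest("Test 3", requires="B", leaves="A"),
]

order = sequence_tests(tests, initial_state="B")
print(order)  # ['Test 3', 'Test 1']
```

If the single record started in State A instead, no valid order would exist and the function would return None, which signals that more test data is needed.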

How does it help Automated Testing?

This is one of the most overlooked foundational principles, yet it is ultimately one of the most important for effective automation. To justify that statement, imagine the following scenario:

You have five automated scripts that you would like to execute before every release. You have to capture test data manually in a spreadsheet before you can perform the tests.

In this example, capturing the data might take 10 minutes before the tests run, which still gives you a fast time to market in the short term. Now imagine that two years have passed since you implemented test automation:

You now have a thousand automated scripts that you need to execute before every release. You need to capture test data manually in a spreadsheet for each of these tests before you can complete them.

As you can see, in the long term, as the automated regression pack grows, it could take months before you release to market, and you will never be able to execute these tests daily or automatically after hours. Apart from the turnaround time for test data requests, the dependency on people to capture test data before each run is massive. With automation implemented this ineffectively, it might even be faster to execute the tests manually than through an automated tool.

With TDM in place, automated scripts create or request test data for each test as defined in the TDM strategy, and automatically consume that data at runtime.
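
A rough Python sketch of that runtime behaviour follows. The DataProvider class, its method names and the "active_customer" data type are all hypothetical, chosen only to show a script asking for data instead of reading a manually captured spreadsheet.

```python
from collections import defaultdict, deque


class DataProvider:
    """Hands out one unused record per request, creating one if none is stored."""

    def __init__(self):
        self._pool = defaultdict(deque)  # data type -> stored records
        self._creators = {}              # data type -> function that creates a record

    def register_creator(self, data_type, creator):
        self._creators[data_type] = creator

    def stock(self, data_type, record):
        self._pool[data_type].append(record)

    def request(self, data_type):
        if self._pool[data_type]:
            return self._pool[data_type].popleft()
        # Nothing stored: fall back to the automated creation process.
        return self._creators[data_type]()


def run_login_check(provider):
    """Stand-in for an automated script that fetches its own data at runtime."""
    customer = provider.request("active_customer")
    return customer["username"]


provider = DataProvider()
provider.stock("active_customer", {"username": "stored_user"})
provider.register_creator("active_customer", lambda: {"username": "created_user"})

first = run_login_check(provider)   # served from the stored data
second = run_login_check(provider)  # store is empty, so a record is created
```

Because each script requests its own data, the regression pack can run unattended however many scripts it contains.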

How does it work?

Identification of test data requirements
- Interviews with the stakeholders to understand the most frequently requested test data types
- Discussions with the Solutions Architect and Developers to identify data creation or extraction processes
- Identification of integrated system dependencies to create test data
Automate the creation or extraction of data
- Using ODBC to query Databases for existing data
- Automation of Front-end processes to create or search for new/existing data that is accurate for the testing requirements
- Automation of Integration Layer to create or search for new/existing data
- Store the data in a repository for easy retrieval
Create an interface for data retrieval (Optional)
- End Users will select the data type required
- Data will be instantly available for download from the web interface
- The data will be stored in an Excel file when downloaded
Supporting Processes
- Automatic top-up of data when the available stock drops below a defined threshold
- Refresh of the data repository when test environments are synced with Production
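
The automatic top-up can be sketched in Python as follows. The function name top_up, the threshold and target parameters, and the create_record helper are assumptions for illustration; in practice the creation step would call whichever automated front-end, API or database process produces the data.

```python
from itertools import count

_ids = count(1)


def create_record() -> dict:
    """Hypothetical stand-in for an automated data creation process."""
    return {"id": next(_ids)}


def top_up(pool: list, create, threshold: int, target: int) -> int:
    """Refill the pool to `target` records once it holds fewer than `threshold`.

    Returns the number of records created.
    """
    if len(pool) >= threshold:
        return 0
    needed = target - len(pool)
    pool.extend(create() for _ in range(needed))
    return needed


pool = [create_record(), create_record()]  # only two records left
created = top_up(pool, create_record, threshold=3, target=5)
print(created, len(pool))  # 3 5
```

A scheduled job could call such a function for each data type, so that testers rarely find the repository empty.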

The test data management model can be defined to support all your test data needs and not just that of the automated test scripts. You can automate the data requirements for manual functional testers, business users, performance testing and the development team.

Sastri Munsamy

Executive: Technology and Innovation for Inspired Testing

Sastri is a passionate and engaging mentor, educator and speaker with extensive experience of real-world testing and automation projects. He has worked in the consultancy industry for over 17 years. He has implemented test automation on systems ranging from desktop and Web to SAP and mobile applications, in multiple industries across the world, with an emphasis on defining an efficient and profitable automation strategy. As a mentor, Sastri has hosted testing community meetups in Cape Town and Johannesburg, and has been a guest speaker at numerous industry events.