Test Data Management and the Cloud By @EFeatherston | @CloudExpo #Cloud

Test Data Management and the Cloud - Keeping All the Plates Spinning

I was recently in Boston at Faneuil Hall Marketplace, and with the long-awaited warm weather, all the street entertainers were out in full force. Singers, musicians, and a variety of juggling acts filled the street, with crowds surrounding them. One act in particular struck a chord with me - the classic spinning plates. We've all seen it at various times in our lives. The entertainer started spinning plates, balanced precariously on top of wooden sticks. More and more plates started spinning, with the entertainer frantically running back and forth; whenever one started to slow down and almost fall, he reached it just in time to get it spinning and balanced again. Then the entertainer reached his limit: he added one more plate, and as he tried to keep them all up, one lone plate down on the end started to wobble, the stick tilting, and before the entertainer could reach it, the plate went crashing to the ground, taking several of the other plates with it.

Those who have been responsible for test data management in a large, complex, integrated environment can probably relate to the spinning plates challenge. So can any Quality Assurance tester or developer who has chased down a Severity 1 blocking bug, only to find the issue was not the code but a flaw in the test data. Identifying, configuring, deploying, and maintaining a valid set of test data remains the bane of many a technologist. How does the cloud impact this? Does it make it better, worse, or more of the same?

Why is test data management so hard?
The goal of any good test data management process is to provide consistent, repeatable test data across your systems and environments, whether it be development, QA, or performance. Ideally, it would be wonderful to have reusable test data sets to leverage across all environments. This would provide consistency, as well as resource and time savings. Sounds basic enough, so what makes it so hard?

There are multiple challenges:

  • Avoiding data collisions: Complex systems that integrate with other systems tend to share test environments due to cost and resource constraints. Other applications are testing against the same system you are integrating with. Coordinating data sets so that no other application under test accidentally uses or overwrites your data is challenging, and it must be addressed. There is nothing worse than chasing what appears to be a bug that actually turns out to be someone else overwriting your test data.
  • Enforcing privacy rules: A common and useful practice is to mine and extract test data from production systems. The key consideration here is compliance with any privacy rules (such as HIPAA). This may require masking the test data, and masking itself may introduce other challenges. A simple example: part of your test data is a customer's name and address, which you need to mask. What if your system validates addresses to ensure they are all real? You could easily create a masked address that now fails basic validation.
  • Ensuring relational integrity across systems: If you are integrating data sets across multiple systems, you may need to maintain the relational integrity of your data across those systems. The masking mentioned above can complicate that process further: a value masked one way in one system and another way in a second system no longer matches.
  • Resetting data sets to a clean starting point: This means you need to understand every change your testing made across all the integrated systems, so that those changes can be backed out and the data returned to a known starting point. Changes propagated across environments can be a key source of unintended consequences in a test environment.
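
To make the masking and relational integrity points concrete, here is a minimal Python sketch of one possible approach: deterministic masking. It is an illustration, not a recommendation of a specific tool. The key, the address pool, and the function names are all assumptions made up for this example. The idea is that the same input always produces the same masked output, so a customer ID masked in one system still matches the same ID masked in another, and masked addresses are drawn from a pool that is assumed to pass your address validation.

```python
import hashlib
import hmac

# Hypothetical secret for this example; a real key would be kept out of source control.
MASKING_KEY = b"example-masking-key"

# Placeholder pool. A real pool would hold addresses already verified
# against the address validation your system performs.
VALID_ADDRESSES = [
    "100 Example Ave, Testville, MA",
    "200 Sample St, Testville, MA",
    "300 Placeholder Rd, Testville, MA",
]


def _digest(value: str) -> bytes:
    """Keyed hash of the original value (HMAC-SHA256)."""
    return hmac.new(MASKING_KEY, value.encode("utf-8"), hashlib.sha256).digest()


def mask_customer_id(customer_id: str) -> str:
    """Return the same masked ID for the same input, in every system."""
    return "CUST-" + _digest(customer_id).hex()[:12]


def mask_address(address: str) -> str:
    """Swap the real address for one from the pre-validated pool."""
    index = int.from_bytes(_digest(address)[:4], "big") % len(VALID_ADDRESSES)
    return VALID_ADDRESSES[index]


# Two systems holding data about the same customer (made-up sample rows).
crm_row = {"customer_id": "C1001", "address": "12 Elm St"}
billing_row = {"customer_id": "C1001", "amount": 42.50}

masked_crm = {
    "customer_id": mask_customer_id(crm_row["customer_id"]),
    "address": mask_address(crm_row["address"]),
}
masked_billing = {
    "customer_id": mask_customer_id(billing_row["customer_id"]),
    "amount": billing_row["amount"],
}

# The masked rows still join on customer_id across the two systems.
rows_still_join = masked_crm["customer_id"] == masked_billing["customer_id"]
```

Because the mapping depends only on the key and the original value, each system can be masked separately and the relationships survive. The trade-off is that the key itself becomes sensitive and needs to be protected like the production data it hides.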

How does the cloud impact all this?
All of the previous challenges discussed still exist when you move to the cloud environment. One of my favorite mantras is "no technology negates the need for good design and planning." Cloud doesn't provide any magic; it's just a tool. It can help in standing up standard repeatable test environments, but the data setup process is still subject to the challenges already discussed.

Additionally, going to the cloud may introduce other challenges that must be considered:

  • SaaS solutions: In SaaS environments, you may not have direct access to the database layer. You are constrained to the mechanisms provided by the SaaS vendors for the extraction and the import of your user data, content, and configuration information. Your test data management process needs to take this into account.
  • Network bandwidth: If part or all of your environments reside in the cloud, you need to take into account the network when doing data loads, especially if you are dealing with large volumes of data either for performance testing or analytics. Bandwidth is usually well thought out for daily operational traffic, but frequently forgotten for initial and test data loads.
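
The bandwidth point is easy to check with back-of-the-envelope arithmetic before a test cycle starts. The Python sketch below is only an estimate under stated assumptions: the function name is made up for this example, sizes use decimal gigabytes, and the default efficiency factor of 0.7 (the share of the nominal link speed you actually get) is an assumed figure, not a measured one.

```python
def estimate_transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move data_gb gigabytes over a link.

    data_gb     -- data volume in decimal gigabytes (1 GB = 10**9 bytes)
    link_mbps   -- nominal link speed in megabits per second
    efficiency  -- assumed fraction of nominal speed actually achieved
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_gb must be >= 0, link_mbps > 0, 0 < efficiency <= 1")
    total_bits = data_gb * 8 * 1000**3
    effective_bits_per_second = link_mbps * 1_000_000 * efficiency
    return total_bits / effective_bits_per_second / 3600


# Example: a 500 GB performance-test data set over a 100 Mbps link.
hours = estimate_transfer_hours(500, 100)
print(f"Estimated load time: {hours:.1f} hours")
```

With these assumed numbers the load takes roughly 15.9 hours, which is the kind of figure that needs to be in the test plan rather than discovered on the first day of testing.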

Keeping all those plates spinning is no easy task
Test Data Management has always been a challenge. Going to the cloud does not make it any easier. In fact, it adds some additional plates you need to keep spinning in order to ensure successful testing of your applications. As technologists, it's important to be sure we know which plates we need, and keep them close so we can keep them spinning. With good design and planning, there is no reason to think the test data management plates are going to come crashing to the ground.

This post is brought to you by The CIO Agenda.

KPMG LLP is a Delaware limited liability partnership and is the U.S. member firm of the KPMG network of independent member firms affiliated with KPMG International Cooperative ("KPMG International"), a Swiss entity. The KPMG name, logo and "cutting through complexity" are registered trademarks or trademarks of KPMG International. The views and opinions expressed herein are those of the authors and do not necessarily represent the views and opinions of KPMG LLP.

More Stories By Ed Featherston

Ed Featherston is VP, Principal Architect at Cloud Technology Partners. He brings 35 years of technology experience in designing, building, and implementing large complex solutions. He has significant expertise in systems integration, Internet/intranet, and cloud technologies. He has delivered projects in various industries, including financial services, pharmacy, government and retail.
