Python saves the day – Test data generation and data evaluation using Python

Data! In many cases it is becoming an increasingly large part of our testing: the need to evaluate it, or the need to create it. Five years ago, when I did an internship at a company that makes a popular tool for data visualization and analysis, the limit for “big data” had risen to 1 million rows. As testers we constantly need to create or evaluate large amounts of data, and doing it manually quickly becomes a tedious or impossible task. Of course you can rely on samples, and of course you can rely on small amounts of test data. You may even be lucky enough to get real data from a production environment. Several times I have found myself in a situation where I want to evaluate or create thousands of records. My go-to language for doing this is Python. I’ve tried others, but I always return to Python. It has served me well. In this talk I will give examples of how I’ve done this.

One example is an elaborate script I wrote to insert customers into a database for use as test data. The system was under development, and with each release the script had to be tweaked because there were new keys, new columns or new relations to take into account. I will also talk about how I compared hundreds of thousands of data rows to evaluate two different imports of traffic data. That task was among the hardest I’ve ever done. It occasionally required a lot of hands-on, manual evaluation, down to the level where we had to reconstruct paper timetables to check some of the data, but it would have been a lot more tedious if I had not been able to rely on a trusty old Python script for the initial evaluation.
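To give a flavour of what an initial comparison of two imports can look like, here is a minimal sketch. It is not the script from the project described above. The column names (trip_id, stop, departure), the sample rows and the helper names load_rows and compare_imports are all made up for illustration, and it assumes that each row can be identified by a small set of key columns.

```python
import csv
import io


def load_rows(csv_text, key_fields):
    """Read CSV text into a dict keyed by the chosen key columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[f] for f in key_fields): row for row in reader}


def compare_imports(old, new):
    """Return keys only in old, keys only in new, and keys whose rows differ."""
    only_old = sorted(old.keys() - new.keys())
    only_new = sorted(new.keys() - old.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return only_old, only_new, changed


# Hypothetical sample data standing in for two imports of the same timetable.
IMPORT_A = """trip_id,stop,departure
1,Central,08:00
1,Harbour,08:12
2,Central,09:00
"""

IMPORT_B = """trip_id,stop,departure
1,Central,08:00
1,Harbour,08:15
3,Central,10:00
"""

# Assumption: a row is uniquely identified by trip and stop.
KEY = ("trip_id", "stop")

only_a, only_b, changed = compare_imports(
    load_rows(IMPORT_A, KEY), load_rows(IMPORT_B, KEY)
)
print("only in A:", only_a)
print("only in B:", only_b)
print("differs:", changed)
```

With real files you would read from disk instead of from strings, and the short lists of missing and differing keys are what you then go through by hand. Note that this sketch silently keeps only the last row if the same key appears twice in one import, so duplicates would need a separate check.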

Today we often discuss whether testers should learn to code or not. This talk will not go into the pros and cons of testers who code, because obviously I do a bit of coding. It is, however, a relatively basic talk, and hopefully it will inspire some testers who have recently started coding, or even some who do not code, to at least try it out.

The day after:
The day after, attendees will hopefully feel inspired to use Python for data generation and data evaluation in testing. In the talk they will have seen some simple examples of how both evaluation and creation can be done, and maybe they will already have checked out one or two of the tutorials provided in the talk.