Application & Script Independent Framework: The Need For Data Normalization

Test automation is one of the fastest-growing areas of software testing, and it has been very successful in attracting the attention of clients and managers. Companies spend considerable money on acquiring the best automation tools and automation experts to make their projects successful. At the same time, the world is becoming more dynamic, and so are requirements. With development models such as Agile, Scrum and RUP, requirements can change at almost any phase. Are the available automation frameworks good enough to cope with this? The answer is no. We must always keep in mind that “requirement changes are inevitable”. Most automation projects end up with a maintenance phase that takes more effort than the initial automation did. Frequent changes in requirements, arriving as change requests, through emails or over the phone, result in a lot of automation maintenance effort. It generally takes 4-5 releases to get ROI on automation, which has a significant impact on the client’s decision whether to go for automation at all.

This paper presents an evolutionary automation framework, ASIF (Application and Script Independent Framework), which can be followed to solve most of the problems stated above. The framework applies data normalization to the test data and object details (object name, object type and object properties) to minimize rework whenever the requirements, design or business flow change. It draws on various RDBMS concepts to maximize the benefit. Database views provide abstraction and flexibility to the overall architecture. A plug-and-play automation library is provided with the framework and can be reused as is in any other automation project.

The key is to make the automation project independent of changes in the application or requirements, so that automation scripts continue to work efficiently with the least possible maintenance effort. The paper provides the process and guidelines to be followed by managers and automation test leads. Follow these steps as you staff, tool, or schedule your test automation project, and you will be well on your way to success.


I have worked on many automation projects and have discussed the automation frameworks used in different companies by different people. Everywhere, people claim to use a different framework but in the end complain that execution is not smooth, maintenance effort is too high, technical and experienced staff are required, test data changes too often, the application is not stable, requirements change in each release, and so on. But shouldn’t our frameworks answer these problems? Why use a framework that is inflexible and incapable of handling changes effectively?

The Problems

1. Redundancy: In data-driven testing, different automation consultants typically create and use test data for their own functional modules, even though much of it may be common to other modules. Maintaining multiple copies of the same data leads to redundancy. In particular, when something changes, every data pool containing the redundant fields must be updated.
Data pools are not normalized, because they are stored as Excel sheets, CSV files, data pools or data tables.

Scenario 1: Consider 500 data pools created for 1000 different test cases by different users. The users may unknowingly have created many common fields in multiple data pools. If some common objects change in the next release, the tester needs to revisit all the data pools to change the duplicated objects. This process is time-consuming and error-prone.

2. Inconsistency: It is generally very difficult to manage the test data for a big application that involves millions of test data values. This often leads to inconsistency, which is one of the prime reasons for script failures.

Scenario 1:
If the object name, which is used for object identification, is changed, all the scripts using that object will abort. The tester then has to find every occurrence of the changed object in all the scripts and replace it with the new value, which is a tedious job.

Scenario 2:
If the object type is changed, all the scripts using that object will abort. Here again, the tester has to find every occurrence of the changed object type in all the scripts and replace it with new statements, which is very time-consuming.

3. Iterative software development: Given today's sophisticated software systems, it is not possible to sequentially define the entire problem at the beginning, design the entire solution, build the software and then test the product at the end. An iterative approach is required, one that increases understanding of the problem through successive refinements and grows an effective solution incrementally over multiple iterations. With every release of a project an executable is produced. As the project progresses and the application grows iteratively, the test suite grows as well.

4. Lack of experienced automation resources: There is a scarcity of automation experts who are both technically sound and have a testing mindset.

5. Change Requests: Requirements change dynamically during the later stages of a project, which affects the scripts already automated.

6. Maintenance effort: Generally, maintenance effort exceeds the initial effort involved. In the end, team members feel it is better to start from scratch than to do maintenance. Many of you must have felt the same while maintaining someone else’s scripts, or sometimes even your own.

7. Tedious: Automation becomes tedious and difficult to continue as complexity grows along with functionality. When an automation candidate is lengthy and contains too many verification points, it can take a day to automate, whereas the same test could be executed manually in 30-45 minutes.

8. Too many failures: It becomes almost impossible to run the suites uninterrupted, because scripts fail for small reasons. Unattended playback remains theoretical. I have often seen people start a suite and go home hoping to read the test log the next morning but, as you may have guessed, the suite does not run for long and fails because of some silly mistake in the code.

9. Difficult to debug: It is often difficult to debug a script written by someone else, or even your own script a few days or months after writing it. To make a small change, you may need to read through 300-400 lines of code if it has not been documented.

Solution: Application and script independent framework

ASIF is an automation framework that has evolved with all the problems mentioned above in mind. It takes advantage of RDBMS concepts and argues strongly that overall effort can be reduced drastically and that ROI can be achieved in the first release itself. Another important aspect is that automation does not really need people who are strong in programming. This method stores all the object properties and test data in a database.

A database schema is designed before any automation scripts are created. The schema captures the interdependencies among the data in order to avoid redundancy. Normalization makes this approach much more effective than the traditional approach.

This framework requires the development of base tables, which are independent of the test automation tool used to execute the tests. These tables are also independent of the common library functions that “drive” the application under test. The script just calls the common functions, which retrieve the data from the database and in turn input it to the application.
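As a rough illustration of this division of labour, the following Python sketch shows a common library function that reads steps from the database and drives the application. Everything named here is an assumption made for the example: the `login_screen` table and its columns, the `run_steps` function, and the `FakeDriver` class, which merely records actions in place of a real automation tool. SQLite is used only because it ships with Python.

```python
import sqlite3


class FakeDriver:
    """Hypothetical stand-in for a real automation tool; it only records actions."""

    def __init__(self):
        self.actions = []

    def set_text(self, object_name, value):
        self.actions.append(("set_text", object_name, value))

    def click(self, object_name):
        self.actions.append(("click", object_name))


def run_steps(conn, source, driver):
    """Common library function: read steps from a table or view and drive the app."""
    # `source` is chosen by the script author, not by end users, so it is
    # interpolated directly; table names cannot be bound as SQL parameters.
    rows = conn.execute(
        f"SELECT object_name, object_type, test_data FROM {source} ORDER BY step_no"
    )
    for object_name, object_type, test_data in rows:
        if object_type == "textbox":
            driver.set_text(object_name, test_data)
        elif object_type == "button":
            driver.click(object_name)
        else:
            raise ValueError(f"unsupported object type: {object_type}")


# Illustrative base table for a login screen (the schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE login_screen ("
    "step_no INTEGER, object_name TEXT, object_type TEXT, test_data TEXT)"
)
conn.executemany(
    "INSERT INTO login_screen VALUES (?, ?, ?, ?)",
    [
        (1, "txtUser", "textbox", "alice"),
        (2, "txtPassword", "textbox", "secret"),
        (3, "btnLogin", "button", None),
    ],
)

# The test script itself shrinks to a single call into the common library.
driver = FakeDriver()
run_steps(conn, "login_screen", driver)
print(driver.actions)
```

Because the script contains no object names or test data of its own, a renamed object only requires a change to the table, not to the script.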

This paper is organized around the usual steps we all follow for our automation frameworks, with special notes on the considerations and challenges that are particular to test automation:

1. Implementation
2. Advantages
3. Conclusion
4. References

Implementation: ASIF


1. Analysis: Perform size estimation, resource estimation and schedule planning, and identify the expected changes and the development model. This approach yields maximum benefits when an iterative model is used.

2. Identify the automation candidates: Select automation candidates from the entire set of test cases. It is best to automate regression test suites, as they need to be executed for each build, in all the subsequent releases.

3. Identify all the possible screens in the AUT: These are the screens of the application that must be navigated to execute the selected test cases.

4. Record object properties for the selected screens: Use the automation tool to capture object properties such as object name, object type and recognition method for the screens within scope.

5. Select an appropriate database: Choose a database depending on the maximum number of users in the team.

6. Design the database schema and create base tables for all the selected screens: Design the base tables, select appropriate data types and field lengths, and follow a standard naming convention. Populate the tables with the properties captured.

7. Create views by joining the required base tables: Design views by joining base tables, based on the test case flow.

8. Include the common library: Common functions are used across all the test cases. These functions are independent of the project, and writing them is a one-time activity.

9. Write scripts for the test cases: Include the library files and simply call the common functions in the library. Scripts are restricted to just a couple of lines that call reusable functions.

Implementation: Database Design

1. Choice of a database:
a) Fewer than 4 users:
If fewer than 4 users are going to connect to the automation database, MS Access is a good choice, but it may crash if more than 4 users try to connect to the database at the same time.

b) 4 or more users:
SQL Server or Oracle can be used to handle multiple connections at the same time without degradation in performance.

2. Schema Design:

The following activities comprise schema design:

a) Base Tables:
Tables contain Object Names/Object types/Test Data for each screen in the application. The first row of all the base tables always contains Object Names/Object Types.

b) Logical Views
We can design views by joining different tables according to the need of any particular Test Case.

Scenario 1: Change in Application-Under-Test:
If a developer changes an object name or an object type in a subsequent release, the tester has to open the corresponding base table and make the change in only one place. Changes made to a base table are automatically reflected in all the views that use it, and all the scripts will work exactly as they did in previous releases.

Scenario 2: Change in the Test Cases or flow
A view corresponds to one particular test case. If a test case changes, the tester just has to modify the corresponding view and need not worry about the scripts. Data changed through an updatable view is written to the underlying base tables, although many databases restrict updates through views that join several tables. This reduces the effort when a particular step in a test case is modified or added.
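The two scenarios can be sketched in Python with SQLite, which is used here only because it is bundled with Python; the paper itself recommends MS Access, SQL Server or Oracle. The table layout (`objects`, `test_steps`), the view names and all data values are assumptions made for the example, not a schema prescribed by ASIF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Base table: object details, stored exactly once per object.
    CREATE TABLE objects (
        object_id   INTEGER PRIMARY KEY,
        screen      TEXT NOT NULL,
        object_name TEXT NOT NULL,
        object_type TEXT NOT NULL
    );
    -- Base table: steps and test data, referring to objects by id only.
    CREATE TABLE test_steps (
        test_case TEXT NOT NULL,
        step_no   INTEGER NOT NULL,
        object_id INTEGER NOT NULL REFERENCES objects(object_id),
        test_data TEXT
    );
    INSERT INTO objects VALUES
        (1, 'Login',  'txtUser',  'textbox'),
        (2, 'Login',  'btnLogin', 'button'),
        (3, 'Search', 'txtQuery', 'textbox');
    INSERT INTO test_steps VALUES
        ('TC_LOGIN',  1, 1, 'alice'),
        ('TC_LOGIN',  2, 2, NULL),
        ('TC_SEARCH', 1, 1, 'alice'),
        ('TC_SEARCH', 2, 2, NULL),
        ('TC_SEARCH', 3, 3, 'laptops');

    -- One view per test case, built by joining the base tables.
    CREATE VIEW tc_login AS
        SELECT s.step_no, o.object_name, o.object_type, s.test_data
        FROM test_steps AS s JOIN objects AS o ON o.object_id = s.object_id
        WHERE s.test_case = 'TC_LOGIN';
    CREATE VIEW tc_search AS
        SELECT s.step_no, o.object_name, o.object_type, s.test_data
        FROM test_steps AS s JOIN objects AS o ON o.object_id = s.object_id
        WHERE s.test_case = 'TC_SEARCH';
    """
)


def object_names(view):
    """Return the object names a test case view exposes, in step order."""
    rows = conn.execute(f"SELECT object_name FROM {view} ORDER BY step_no")
    return [name for (name,) in rows]


# Scenario 1: a developer renames the login button. One UPDATE on the base
# table is enough; both views pick up the new name.
conn.execute("UPDATE objects SET object_name = 'btnSignIn' WHERE object_id = 2")
print(object_names("tc_login"))   # ['txtUser', 'btnSignIn']
print(object_names("tc_search"))  # ['txtUser', 'btnSignIn', 'txtQuery']

# Scenario 2: the search test case no longer performs the login steps.
# Only the view is redefined; base tables and scripts stay untouched.
conn.executescript(
    """
    DROP VIEW tc_search;
    CREATE VIEW tc_search AS
        SELECT s.step_no, o.object_name, o.object_type, s.test_data
        FROM test_steps AS s JOIN objects AS o ON o.object_id = s.object_id
        WHERE s.test_case = 'TC_SEARCH' AND o.screen = 'Search';
    """
)
print(object_names("tc_search"))  # ['txtQuery']
```

In both scenarios a script that reads from `tc_login` or `tc_search` keeps working without modification, which is the point of placing the views between the scripts and the base tables.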

c) DSN (Data Source Name):
To connect to the database, a ‘User DSN’ is needed on the machine where the script is going to be executed.

Creation of DSN requires three parameters:
Type of the Database
DSN Name
Database Path
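Once the DSN exists, a script refers to it by name in an ODBC connection string. The Python sketch below only builds that string; `DSN`, `UID` and `PWD` are standard ODBC connection-string keywords, while the helper name `build_odbc_connection_string` and the DSN name `AutomationDB` are hypothetical. Actually opening the connection would need an ODBC library such as the third-party `pyodbc` package, which is shown in comments only.

```python
def build_odbc_connection_string(dsn_name, user=None, password=None):
    """Build an ODBC connection string that refers to an existing DSN."""
    parts = [f"DSN={dsn_name}"]
    if user is not None:
        parts.append(f"UID={user}")
    if password is not None:
        parts.append(f"PWD={password}")
    return ";".join(parts)


# "AutomationDB" is a made-up DSN name used for illustration.
conn_str = build_odbc_connection_string("AutomationDB")
print(conn_str)  # DSN=AutomationDB

# With pyodbc installed and the DSN configured on the machine, a script
# would then connect roughly like this:
#   import pyodbc
#   conn = pyodbc.connect(conn_str)
```

Keeping the database type and path inside the DSN means the scripts carry only the DSN name, so the database can be moved without editing them.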

Advantages of ASIF

1. Normalization: Designing the database schema with normalization eliminates the chances of inconsistency and redundancy to a large extent. We can reduce effort by using the advantages of an RDBMS and can reduce the number of data source connections in any script. The approach uses the concept of “views”, which are simply logical combinations of tables. A common set of tables (the database) is used by all the users instead of each creating individual data pools.
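To make the effect of normalization concrete, here is a small Python sketch that loads two flat data pools with overlapping fields into a normalized pair of tables. The pools, table names and values are invented for the example, and SQLite stands in for whichever database the team actually selects.

```python
import sqlite3

# Two flat data pools, as they might exist in separate CSV files. The login
# object details are duplicated in both (all values are made up).
pool_login = [
    {"object_name": "txtUser", "object_type": "textbox", "test_data": "alice"},
    {"object_name": "btnLogin", "object_type": "button", "test_data": None},
]
pool_search = [
    {"object_name": "txtUser", "object_type": "textbox", "test_data": "alice"},
    {"object_name": "btnLogin", "object_type": "button", "test_data": None},
    {"object_name": "txtQuery", "object_type": "textbox", "test_data": "laptops"},
]

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE objects (
        object_id   INTEGER PRIMARY KEY,
        object_name TEXT NOT NULL UNIQUE,
        object_type TEXT NOT NULL
    );
    CREATE TABLE test_data (
        pool      TEXT NOT NULL,
        step_no   INTEGER NOT NULL,
        object_id INTEGER NOT NULL REFERENCES objects(object_id),
        value     TEXT
    );
    """
)

for pool_name, pool in (("login", pool_login), ("search", pool_search)):
    for step_no, row in enumerate(pool, start=1):
        # Store each object once; a repeated name is ignored thanks to UNIQUE.
        conn.execute(
            "INSERT OR IGNORE INTO objects (object_name, object_type) VALUES (?, ?)",
            (row["object_name"], row["object_type"]),
        )
        (object_id,) = conn.execute(
            "SELECT object_id FROM objects WHERE object_name = ?",
            (row["object_name"],),
        ).fetchone()
        conn.execute(
            "INSERT INTO test_data VALUES (?, ?, ?, ?)",
            (pool_name, step_no, object_id, row["test_data"]),
        )

flat_copies = len(pool_login) + len(pool_search)
(stored_objects,) = conn.execute("SELECT COUNT(*) FROM objects").fetchone()
print(flat_copies, stored_objects)  # 5 3
```

The flat pools hold five copies of object details, while the normalized schema holds three, one per distinct object, so a rename touches a single row regardless of how many pools use the object.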

2. Application & Script independent:
We must strive to develop a single framework that grows and continuously improves with the future projects that challenge us. This approach makes the test script completely independent of the application. Script creation becomes a one-time effort, and maintenance effort is reduced when changes are made.

Fortunately, this heavy initial investment is mostly a one-shot deal. Once in place, ASIF is arguably the easiest of the existing automation frameworks. It offers great potential for long-term success.

Scenario 1: Let us assume that 1000 test cases share the same precondition: logging in to the application. If the object name for Login changes, the users need not touch 1000 scripts. With ASIF they just change the object name in the single base table containing information about the Login screen, and the change is reflected in all 1000 scripts.

3. Data storage and Retrieval efficiency
A database can handle large amounts of data efficiently, and retrieval of the data is much faster. When something changes, data and object names need to be changed in only one place, which makes this approach very attractive. A script does not need to connect to more than one data source.

4. Views & Queries
Views can be created from base tables, and each view corresponds to one test case. This makes a script very simple to understand, as it fetches data from only one source, customized for the test case. A view also consumes no storage for data, as it is just a logical table. Changes made to the base tables are automatically visible in the views and, where the database permits updates through a view, changes made through the view are written to the base tables.
Scenario 1: Consider an object on a screen that is stored in a single base table, with multiple views accessing that base table. Scripts refer to the corresponding logical views to get the required object name, object type and test data. If the object type changes, the tester just needs to change the details of the object in the corresponding base table, and the change is reflected in all the views and scripts.

5. Logical view and flow of the application
ASIF provides the user with a logical view of the overall application, which is easier to understand for new team members, reviewers and, most importantly, clients.

ASIF requires the design of a database schema, which is a time-consuming process and needs a good understanding of database concepts. The schema can be reused across builds as new change requests arrive. Once it has been designed, making new changes is neither very difficult nor very time-consuming, which makes it very useful in the long run.

6. 100 % Reusable Library: ASIF provides the ready-to-use reusable library which can be used in any automation project.

7. Modularity: Scripts are broken down into modules that are independent of each other, so that the same module can be called from multiple scripts; this also makes the scripts very easy to debug.
This insulates the application from modifications in the component and provides modularity in the application design. The test script modularity applies the principle of abstraction or encapsulation in order to improve the maintainability and scalability of automated test suites.


ASIF requires relatively more effort the first time, but it greatly reduces the maintenance time for subsequent releases. It is ideal especially when the development team follows an iterative model such as Agile, Scrum or RUP and the regression test cases need to be automated and executed for each and every build to ensure that the functionality is working fine.

