Monday, September 3, 2007

DataBase Testing

Database Testing



Why Test an RDBMS?
There are several reasons why you need to develop a comprehensive testing strategy for your RDBMS:
The current state of the art in many organizations is for data professionals to control changes to the database schemas, for developers to visually inspect the database during construction, and to perform some form of formal testing during the test phase at the end of the lifecycle. Unfortunately, none of these approaches prove effective. Application developers will often go around their organization's data management group because they find them too difficult to work with, too slow in the way they work, or sometimes they don't even know they should be working together. The end result is that the teams don't follow the desired data quality procedures and as a result quality suffers.

Uncomfortable Question:
Isn't it time that we stopped talking about data quality and actually started doing something about it?

What to test in an RDBMS.
Black-Box Testing at the Interface
White/Clear-Box Testing Internally Within the Database
O/R mappings (including the meta data)
Incoming data values
Outgoing data values (from queries, stored functions, views ...)
Scaffolding code (e.g. triggers or updateable views) which support refactorings
Typical unit tests for your stored procedures, functions, and triggers
Existence tests for database schema elements (tables, procedures, ...)
View definitions
Referential integrity (RI) rules
Default values for a column
Data invariants for a single column
Data invariants involving several columns

When Should We Test?
Agile software developers take a test-first approach to development where they write a test before you write just enough production code to fulfill that test. The first step is to quickly add a test, basically just enough code to fail. Next you run your tests, often the complete test suite although for sake of speed you may decide to run only a subset, to ensure that the new test does in fact fail. You then update your functional code to make it pass the new tests. The fourth step is to run your tests again. If they fail you need to update your functional code and retest. Once the tests pass the next step is to start over.

How to Test
Although you want to keep your database testing efforts as simple as possible, at first you will discover that you have a fair bit of both learning and set up to do. In this section I discuss the need for various database sandboxes in which people will test: in short, if you want to do database testing then you're going to need test databases (sandboxes) to work in. I then overview how to write a database test and more importantly describe setup strategies for database tests. Finally, I overview several database testing tools which you may want to consider.
Database Sandboxes
A common best practice on agile teams is to ensure that developers have their own "sandboxes" to work in, depicts the various types of sandboxes which your team may choose to work in. In each sandbox you'll have a copy of the database. In the development sandbox you'll experiment, implement new functionality, and refactor existing functionality, validate your changes through testing, and then eventually you'll promote your work once you're happy with it to the project integration sandbox. In this sandbox you will rebuild your system and then run all the tests to ensure you haven't broken anything (if so, then back to the development sandbox). Occasionally, at least once an iteration/cycle, you'll deploy your work to the level (demo and pre-production testing), and every so often (perhaps once every six to twelve months) into production. The primary advantage of sandboxes are that they help to reduce the risk of technical errors adversely affecting a larger group of people than is absolutely necessary at the time.

Writing Database Tests
There's no magic when it comes to writing a database test, you write them just like you would any other type of test. Database tests are typically a three-step process:
1. Setup the test. You need to put your database into a known state before running tests against it. There are several strategies for doing so.
2. Run the test. Using a database regression testing tool, run your database tests just like you would run your application tests.
3. Check the results. You'll need to be able to do "table dumps" to obtain the current values in the database so that you can compare them against the results which you expected.
Setting up Database Tests
To successfully your database you must first know the exact state of the database, and the best way to do that is to simply put the database in a known state before running your test suite. There are two common strategies for doing this:
1. Fresh start. A common practice is to rebuild the database, including both creation of the schema as well as loading of initial test data, for every major test run (e.g. testing that you do in your project integration or pre-production test sandboxes).
2. Data reinitialization. For testing in developer sandboxes, something that you should do every time you rebuild the system, you may want to forgo dropping and rebuilding the database in favor of simply reinitializing the source data. You can do this either by erasing all existing data and then inserting the initial data vales back into the database, or you can simple run updates to reset the data values. The first approach is less risky and may even be faster for large amounts of data.
An important part of writing database tests is the creation of test data. You have several strategies for doing so:
1. Have source test data. You can maintain an external definition of the test data, perhaps in flat files, XML files, or a secondary set of tables. This data would be loaded in from the external source as needed.
2. Test data creation scripts. You develop and maintain scripts, perhaps using data manipulation language (DML) SQL code or simply application source code (e.g. Java or C#), which does the necessary deletions, insertions, and/or updates required to create the test data.
3. Self-contained test cases. Each individual test case puts the database into a known state required for the test.
What Testing Tools Are Available?
I believe that there are several critical features which you need to successfully test RDBMSs.
The testing tools should support the language that you're developing in. For example, for internal database testing if you're a Microsoft SQL Server developer, your T-SQL procedures should likely be tested using some form of T-SQL framework. Similarly, Oracle DBAs should have a PL-SQL-based unit testing framework. Third, you need tools which help you to put your database into a known state, which implies the need not only for test data generation but also for managing that data (like other critical development assets, test data should be under configuration management control).
Table 2. Some database testing tools.
Category
Description
Examples
Unit testing tools
Tools which enable you to regression test your database.
DBUnit
NDbUnit
OUnit for Oracle (being replaced soon by Qute)
SQLUnit
TSQLUnit (for testing T-SQL in MS SQL Server)
Visual Studio Team Edition for Database Professionals includes testing capabilities
Testing tools for load testing
Tools simulate high usage loads on your database, enabling you to determine whether your system's architecture will stand up to your true production needs.
Empirix
Mercury Interactive
RadView
Rational Suite Test Studio
Web Performance
Test Data Generator
Developers need test data against which to validate their systems. Test data generators can be particularly useful when you need large amounts of data, perhaps for stress and load testing.
Data Factory
Datatect
DTM Data Generator
Turbo Data

Who Should Test?
During development cycles, the primary people responsible for doing database testing are application developers and agile DBAs. They will typically pair together, and because they are hopefully taking a TDD-approach to development the implication is that they'll be doing database unit testing on a continuous basis. During the release cycle your testers, if you have any, will be responsible for the final system testing efforts and therefore they will also be doing database testing.
The role of your data management (DM) group, or IT management if your organization has no DM group, should be to support your database testing efforts. They should promote the concept that database testing is important, should help people get the requisite training that they require, and should help obtain database testing tools for your organization. As you have seen, database testing is something that is done continuously by the people on development teams, it isn't something that is done by another group (except of course for system testing efforts). In short, the DM group needs to support database testing efforts and then get out of the way of the people who are actually doing the work.
Beware Coupling:
One danger with database regression testing, and with regression testing in general, is coupling between tests. If you put the database into a known state, then run several tests against that known state before resetting it, then those tests are potentially coupled to one another. Coupling between tests occurs when one test counts on another one to successfully run so as to put the database into a known state for it. Self-contained test cases do not suffer from this problem, although may be potentially slower as a result due to the need for additional initialization steps.

=========================================================================

No comments:

Counts