So very recently I was approached by a team member to write some unit tests around a stored procedure that was being modified. I have never written unit tests against a stored procedure before, so I was intrigued. I wasn't really sure where I was going when I started, but I opted to write a small console app in C# to prove out the idea.
For the sake of discussion, let's say the stored procedure takes a single parameter, a US zip code, and returns a record set with a single record containing five columns that describe time-zone information for that zip code. I really couldn't ask for a better test subject for a first pass: not too big, not too complicated.
My first inclination was to approach it like unit testing any other unit of code. I figured I would need some kind of data store for my expected values and stored proc inputs. At this point it occurred to me, "Hey, I have a database full of inputs and expected results right here," so I crafted a quick query to get my expected values. From that, I called the stored proc for each zip and verified the results. Worked like a champ. However, I was unaware that the stored proc refactoring had already been done on the database I was working against. Around this point I started getting a sinking feeling that something wasn't right with the test. I attributed it to not being sure what to expect, as this was a new testing scenario for me (typically the database developers are in charge of developing and testing their own code; while I can write SQL with the best of 'em, a DBA I am not), and moved on.
Well, OK: I had the behavior of the new code, the "actual" results in testing terms. Now I could simply aim my program at the old database with the old code base to get the expected results, and modify my app to compare the two. And then I discovered that the underlying data structure for the old code was different. The old way, the SP selects the five columns I need by joining three tables and filtering on the passed-in zip code. The new way, a nightly job populates a new "Time Zone" table, and the SP simply selects all columns from that table, again filtering by zip code.
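To make the difference concrete, here's a minimal sketch in Python with SQLite (the real system was a C# app against a SQL database, and every table and column name here is invented for illustration) of the old join-based shape versus the new single-table shape:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical old structure: three tables joined to produce the result.
    CREATE TABLE ZipCode     (Zip TEXT PRIMARY KEY, TimeZoneId INT, CountyId INT);
    CREATE TABLE TimeZoneDef (TimeZoneId INT PRIMARY KEY, Name TEXT, Abbr TEXT, UtcOffset INT);
    CREATE TABLE County      (CountyId INT PRIMARY KEY, ObservesDst INT);

    -- Hypothetical new structure: one flat table filled by a nightly job.
    CREATE TABLE TimeZone (Zip TEXT PRIMARY KEY, Name TEXT, Abbr TEXT, UtcOffset INT, ObservesDst INT);

    INSERT INTO ZipCode     VALUES ('46201', 1, 10);
    INSERT INTO TimeZoneDef VALUES (1, 'Eastern', 'EST', -5);
    INSERT INTO County      VALUES (10, 1);
    INSERT INTO TimeZone    VALUES ('46201', 'Eastern', 'EST', -5, 1);
""")

# The old SP's query shape: five columns from a three-table join.
OLD_SQL = """
    SELECT z.Zip, t.Name, t.Abbr, t.UtcOffset, c.ObservesDst
    FROM ZipCode z
    JOIN TimeZoneDef t ON t.TimeZoneId = z.TimeZoneId
    JOIN County c      ON c.CountyId   = z.CountyId
    WHERE z.Zip = ?
"""
# The new SP's query shape: everything from the pre-built table.
NEW_SQL = "SELECT Zip, Name, Abbr, UtcOffset, ObservesDst FROM TimeZone WHERE Zip = ?"

old_row = con.execute(OLD_SQL, ("46201",)).fetchone()
new_row = con.execute(NEW_SQL, ("46201",)).fetchone()
print(old_row == new_row)  # same five columns either way, for the same zip
```

The calling signature and result shape are identical; only the plumbing behind them changed, which is exactly what made my original test misleading.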
So finally, it hits me: my test as it stands really only verifies that I get a result for every zip code I ask for, and that the SP selected the same row I pulled my expected result from. And maybe that's exactly what it should be testing; it does add some value. And yet it feels empty.
What I ended up doing was retrieving the master list of zip codes from the old structure's table, running both the new and old SP for each zip code, and comparing the results. This test felt better: the underlying data structures driving the stored proc could change, and the test would stay isolated and continue to work as long as the SP's calling signature stayed the same.
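In outline, the comparison looks like this (the real harness was a C# console app making database calls; the Python functions below are stand-ins for the two stored procedure calls, and the data is made up):

```python
# Stand-in result sets for the old and new stored procedures. In the real
# harness, each lookup was a database call from the C# console app.
OLD_RESULTS = {
    "46201": ("46201", "Eastern", "EST", -5, 1),
    "60601": ("60601", "Central", "CST", -6, 1),
}
NEW_RESULTS = dict(OLD_RESULTS)  # refactored SP, same answers

def old_sp(zip_code):
    return OLD_RESULTS.get(zip_code)

def new_sp(zip_code):
    return NEW_RESULTS.get(zip_code)

def compare_stored_procs(zip_codes):
    """Run both versions for every zip and collect any disagreements."""
    return [z for z in zip_codes if old_sp(z) != new_sp(z)]

# In the real test, the master list came from the old structure's table.
master_zip_list = list(OLD_RESULTS)
mismatches = compare_stored_procs(master_zip_list)
print(mismatches)  # empty list when old and new agree for every zip
```

The nice property is that the test only knows the SP's inputs and outputs; it never looks at the tables behind either version.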
The rub is, I could not have written my new test app without the new SP already being developed. So what would I have been testing in its first iteration, had I written tests for it at the time? I would have written my original test, which ultimately grabs its expected data from the same database the stored proc does.
What I should have done was adapt my initial test to work the same way against the old data, verifying that the stored proc returned the right data in my five columns. Then I should have added a test verifying that the data in the new table matches the data from the old data structure (a test that would typically be the responsibility of the db job developers). Then I should have modified my test to get its expected data from the new (and now tested) structures and continued to verify the SP results against that data.
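That middle step, checking that the new table matches the old structures, can be a single set-difference query. A sketch, again with hypothetical SQLite tables standing in for the real schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE ZipCode     (Zip TEXT PRIMARY KEY, TimeZoneId INT, CountyId INT);
    CREATE TABLE TimeZoneDef (TimeZoneId INT PRIMARY KEY, Name TEXT, Abbr TEXT, UtcOffset INT);
    CREATE TABLE County      (CountyId INT PRIMARY KEY, ObservesDst INT);
    CREATE TABLE TimeZone    (Zip TEXT PRIMARY KEY, Name TEXT, Abbr TEXT, UtcOffset INT, ObservesDst INT);

    INSERT INTO ZipCode     VALUES ('46201', 1, 10);
    INSERT INTO TimeZoneDef VALUES (1, 'Eastern', 'EST', -5);
    INSERT INTO County      VALUES (10, 1);
    INSERT INTO TimeZone    VALUES ('46201', 'Eastern', 'EST', -5, 1);
""")

# Rows produced by the old three-table join that are missing from (or
# different in) the new table. An empty result means the nightly job
# populated TimeZone consistently with the old structures.
CONSISTENCY_SQL = """
    SELECT z.Zip, t.Name, t.Abbr, t.UtcOffset, c.ObservesDst
    FROM ZipCode z
    JOIN TimeZoneDef t ON t.TimeZoneId = z.TimeZoneId
    JOIN County c      ON c.CountyId   = z.CountyId
    EXCEPT
    SELECT Zip, Name, Abbr, UtcOffset, ObservesDst FROM TimeZone
"""
missing = con.execute(CONSISTENCY_SQL).fetchall()
print(missing)  # empty list when the new table agrees with the old join
```

With that consistency test in place, pointing my SP test at the new table is no longer circular; the new table has been independently verified against the old data.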
When unit testing code, you typically have to "mock" out the data calls to control your tests, so it's a context switch to go from avoiding all database calls to requiring the database for everything. But the constant in both is testing the discrete parts from the bottom up. In retrospect, I could have done this test entirely in SQL, and suddenly the MS "Database Unit Test" makes a lot more sense.