Thursday, February 6, 2014

NUnit Test Data in CSV or Excel

     I'm always trying to find ways to build what I call "lightweight data driven tests" and I keep coming back to Excel or CSV as a way to accomplish this quickly.  Why? Because, from thought to execution you can quickly get a large number of tests completed while keeping them easy to maintain. Data driven testing is much like gold mining, some people spend a lot of money and only find dirt others get rich quick and everything between that.  The best case for this Pattern is when you want a third party to take some degree of ownership of the data, a PM, manual QA, customer service, Technical Account manager, Sales Engineer, or other. The example below will be in CSV in case you don't actually have Excel installed. But that won't matter because Excel and CSV are basically the same in code, Excel is just easier to work with.
     First off here is a sample test "TestCsvData" and a TestCaseSource "CSVDATA" that calls the "TestDataReader" class's "ReadCsvData" method to get its test data.
[TestCaseSource("CSVDATA")]
public void TestCsvData(string a, string b, string c)
{
    Assert.AreEqual(int.Parse(a) + int.Parse(b), int.Parse(c));
}

public IEnumerable<TestCaseData> CSVDATA
{
    get
    {
        List<TestCaseData> testCaseDataList = new TestDataReader().ReadCsvData(@"Data\CSVData.csv");
        if (testCaseDataList != null)
            foreach (TestCaseData testCaseData in testCaseDataList)
                yield return testCaseData;
    }
}

     Next we just need to make an OleDbConnection to the file and select the rows we want to use. For a CSV file the table is the file name and for a Excel file the spreadsheet is the table. The first row is treated as the column names, if you don't want to return all columns as test data then change the select statement .
class TestDataReader
{
    public List<TestCaseData> ReadCsvData(string csvFile, string cmdText = "SELECT * FROM [{0}]")
    {
        if (!File.Exists(csvFile))
            throw new Exception(string.Format("File name: {0}", csvFile), new FileNotFoundException());
        var file = Path.GetFileName(csvFile);
        var pathOnly = Path.GetFullPath(csvFile).Replace(file, "");
        var tableName = string.Format("{0}#{1}", Path.GetFileNameWithoutExtension(file), Path.GetExtension(file).Remove(0,1));

        cmdText = string.Format(cmdText, tableName);
        var connectionStr = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Text;HDR=Yes\"", pathOnly);
        var ret = new List<TestCaseData>();
        using (var connection = new OleDbConnection())
        {
            connection.ConnectionString = connectionStr;
            connection.Open();

            var cmd = connection.CreateCommand();
            cmd.CommandText = cmdText;
            var reader = cmd.ExecuteReader();

            if (reader == null)
                throw new Exception(string.Format("No data return from file, file name:{0}", csvFile));
            while (reader.Read())
            {
                var row = new List<string>();
                var feildCnt = reader.FieldCount;
                for (var i = 0; i < feildCnt; i++)
                    row.Add(reader.GetValue(i).ToString());
                ret.Add(new TestCaseData(row.ToArray()));
            }
        }
        return ret;
    }

    public List<TestCaseData> ReadExcelData(string excelFile, string cmdText = "SELECT * FROM [Sheet1$]")
    {
        if (!File.Exists(excelFile))
            throw new Exception(string.Format("File name: {0}", excelFile), new FileNotFoundException());
        string connectionStr = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0 Xml;HDR=YES\";", excelFile);
        var ret = new List<TestCaseData>();
        using (var connection = new OleDbConnection(connectionStr))
        {
            connection.Open();
            var command = new OleDbCommand(cmdText, connection);
            var reader = command.ExecuteReader();
            if (reader == null)
                throw new Exception(string.Format("No data return from file, file name:{0}", excelFile));
            while (reader.Read())
            {
                var row = new List<string>();
                var feildCnt = reader.FieldCount;
                for (var i = 0; i < feildCnt; i++)
                    row.Add(reader.GetValue(i).ToString());
                ret.Add(new TestCaseData(row.ToArray()));
            }
        }
        return ret;
    }
}

So why Excel? Its just easier than anything else.... 

  • Excel was built for this type of data manipulation 
  • Excel has lots of functionality to help you get work done quick such as pattern copy
  • Using a file instead of DB makes the test suite more accessible to non-technical people
  • SQL will require a lot more time to develop
  • SQL will require a lot more time to maintain
  • CSV not very good readability or data entry 
Generic methods like these return all the data as string parameters so its up to the test method to correctly type the data.

In the past I have had a hard time convincing people of the benefits of testing this way, I'll do my best here to allay their concerns and yours.
Concern: Putting the test data in Excel or CSV will somehow make the data become irrelevant especially over time. I think they have the perception that having the data in a file external to the test class will cause a data issue or slow down the time to find and fix issues with the data.
Answer: False, the data will become irrelevant no matter where you store it, so focus on changing/fixing it as soon as it happens regardless of data location.
Concern: By putting the test data in Excel or CSV the data will lose visibility, meaning the PM's, managers or manual testers won't have visibility to the data.
Answer: False, we are talking about moving the data from source code to Excel or CSV file and it will most likely be in an adjacent folder. That means the data is going to be in the same source control as before so no access control will change. Furthermore, you might grant write permissions to manual testers on an excel file to add more test cases where you won't want them touching source code.
Concern: Shouldn't we put this in SQL, reasons: (reporting, visibility, easy data entry, etc)
Answer: False, if you put this data in SQL you need to build a data access layer and this will make changing the data more difficult and rigid. Also, if you don't already have one, using SQL now will add a new dependency on SQL.  Rule of thumb SQL is great for enterprise software but not at all lightweight.
Concern: Excel or CSV  will not scale to meet the data needs, or will have poor performance.
Answer: If you have this problem with your test data, you have other more important problems. I believe if you could actually find enough relevant tests to make Excel or CSV perform poorly enough to bother you then you hit a testing gold mind.