I spent much of last weekend running tests against a table with 122,000 records. The folks at Citibank and Geico would just yawn at that amount of data, but where I work that's fairly heavy lifting. As I was tuning and validating the tests, I used a bit-field flag ("TestCompleted") to track whether a record had already been tested. No need to do work twice, eh?
Because this was a one-time set of tests, I used a TableAdapter (the simplest possible code) to grab all the records, then I needed to filter out all the records that had already been processed. My initial solution was to use a DataView: just set the RowFilter property with a filter expression ("TestCompleted = false"), and voila--instant filtration! Although it wasn't quite instant: my computer required 9 seconds to perform the operation.
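For readers who haven't used this technique, here is a minimal sketch of the DataView approach. The table name and the integer Id column are illustrative, not from my actual test harness; the only parts that matter are the boolean TestCompleted column and the RowFilter expression.

```csharp
using System;
using System.Data;

class DataViewFilterDemo
{
    static void Main()
    {
        // Build a small stand-in table; in my case the data came
        // from a TableAdapter fill, not hand-added rows.
        DataTable table = new DataTable("TestRecords");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("TestCompleted", typeof(bool));
        table.Rows.Add(1, true);
        table.Rows.Add(2, false);
        table.Rows.Add(3, false);

        // The DataView approach: set RowFilter, then enumerate the view.
        DataView view = new DataView(table);
        view.RowFilter = "TestCompleted = false";

        foreach (DataRowView rowView in view)
        {
            Console.WriteLine(rowView["Id"]); // only the unprocessed records
        }
    }
}
```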
For reasons I won't go into here, I had to switch to using the array of DataRow objects returned by the DataTable.Select method. The same filter expression ("TestCompleted = false") was simply passed as a parameter to the Select method, rather than to the DataView.RowFilter setter. So I ran the test again, and....whoa!
I had my array of DataRow objects in just one second. That's nearly an order of magnitude faster than the DataView approach!
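The switch itself is a one-line change. Continuing the illustrative table from before (same assumed column names), the Select version looks like this:

```csharp
using System;
using System.Data;

class SelectFilterDemo
{
    static void Main()
    {
        DataTable table = new DataTable("TestRecords");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("TestCompleted", typeof(bool));
        table.Rows.Add(1, true);
        table.Rows.Add(2, false);
        table.Rows.Add(3, false);

        // The Select approach: the filter expression goes straight to
        // DataTable.Select, which returns a plain DataRow array.
        DataRow[] pending = table.Select("TestCompleted = false");

        foreach (DataRow row in pending)
        {
            Console.WriteLine(row["Id"]); // only the unprocessed records
        }
    }
}
```

Note that Select hands you a snapshot array of rows, whereas a DataView is a live, sorted/filtered view over the table--for a one-time pass like mine, the snapshot is all you need.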
Frankly, I don't know why the DataView.RowFilter approach is so much more computationally expensive than the DataTable.Select approach. And perhaps for small amounts of data, the two approaches would be equally good--or the DataView might even be better and faster. I didn't test the small-recordset scenario, so I don't know that either. (Note: I have subsequently posted an analysis of the reason why the DataView is so much slower, for those who are interested.)
What I do know is this: if you're filtering a large amount of data in a DataTable, and you care about system performance, you will want to filter your data by using DataTable.Select (to get an array of DataRow objects), rather than by setting DataView.RowFilter.