I have two Great Plains databases, call them DB1A and DB1B. Their schemas are identical and the data is about 99% identical; DB1B just has slightly more data. I'm trying to optimize a very complex query that uses views nested inside views and is far too long to post in full here. The query was averaging 1 min 43 sec before I started.

I noticed that all of the tables involved were heaps (Microsoft's fault, not mine), so I started converting them to clustered tables one by one, checking the plan and the parallelism after each change. The first weird thing: after I converted 3 or 4 of the tables, the query stopped using parallelism, even though the new serial plan could result in much longer query times. I got to the point in DB1B where the query took only 1 second, was not parallel, and had low logical reads and zero physical reads.

When I then added the same clustered indexes to the corresponding tables in DB1A, the query plan was different, still parallel, and the query took 56 seconds with high logical reads but still zero physical reads. Why?
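For context, this is roughly how I've been finding the heaps and converting them; the table, column, and query names below are placeholders, not the actual Great Plains objects:

```sql
-- List the user tables that are still heaps (index_id = 0 means no clustered index)
SELECT s.name AS schema_name, t.name AS table_name
FROM sys.tables  AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
JOIN sys.indexes AS i ON i.object_id = t.object_id
WHERE i.index_id = 0
ORDER BY s.name, t.name;

-- Convert one heap at a time by adding a clustered index (placeholder names)
CREATE CLUSTERED INDEX CIX_SomeGPTable ON dbo.SomeGPTable (SomeKeyColumn);

-- Re-run the problem query with I/O and timing statistics to compare reads and duration
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- ...the big nested-view query goes here...
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```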
The two databases are on the same instance, and I assume SQL Server decides on its own how much of the buffer cache each database gets. Since DB1A is production and DB1B is only for my testing, I'm guessing DB1A gets a lot more of the cache. But if there are zero physical reads, doesn't that mean the data is coming from the cache in both databases anyway?
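If it helps, this is the standard DMV query I was going to use to see how the buffer pool is currently split between the two databases:

```sql
-- Approximate buffer pool usage per database (8 KB pages converted to MB)
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*) * 8 / 1024  AS cached_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_mb DESC;
```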
To add to the confusion, I ran UPDATE STATISTICS WITH FULLSCAN on the main table in the query in both databases. Not only did that not help the query in DB1A, it caused the query in DB1B to slow down to 2+ minutes and do millions of logical reads, just like in DB1A. Is it possible that SQL Server was only using the fast plan in DB1B because the statistics were out of date? Do you think the cache is playing a bigger role here than the statistics? Am I missing something?
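For the record, this is what I ran (again, the table name is a placeholder for the main Great Plains table in the query), plus the query I used to confirm when the statistics were last updated:

```sql
-- Rebuild all statistics on the table with a full scan (what I ran in both databases)
UPDATE STATISTICS dbo.SomeGPTable WITH FULLSCAN;

-- Check when each statistic on that table was last updated
SELECT st.name                               AS stats_name,
       STATS_DATE(st.object_id, st.stats_id) AS last_updated
FROM sys.stats AS st
WHERE st.object_id = OBJECT_ID('dbo.SomeGPTable');
```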
Also, when I ran the UPDATE STATISTICS WITH FULLSCAN, my page life expectancy took a nosedive (at least I think that was the cause).
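In case it matters, this is how I was checking page life expectancy before and after the stats update:

```sql
-- Instance-level page life expectancy from the Buffer Manager counters
SELECT [object_name], counter_name, cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND [object_name] LIKE '%Buffer Manager%';
```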