Today I walked through someone's code for a basic scheduler ...
The class had a thread which woke up on intervals and looked through a collection of objects that met an interface IScheduledTask ... In general this is a great utility class everyone should have, in my experience you can absolutly kill your thread pool using timers and since it is a centralized place for scheduling it makes for easy diagnostics/metrics. That said lets get into the abstraction.
The scheduling class was killing the CPU because it had ... 250k items in it ... these items had a lifespan of <> 10 seconds. The internal collection was not being stored in a sorted fashion.
OK there are a few things wrong here; the largest is the not storing in a sorted fashion which would have made things alot faster on the first loop (the running of the scheduled task).
foreach(ScheduledTask st in MyScheduledTasks) {
if(st.TimeToRun <= DateTime.Now) {
ItemsFound.Add(st);
}
}
this would obviously be improved by sorting the items thus the loop turns into ...
foreach(ScheduledTask st in MyScheduledTasks) {
if(st.TimeToRun > DateTime.Now) {
break;
} else {
ItemsFound.Add(st);
}
}
this was not however where all of the time was being wasted! The time was being wasted removing the items from the collection.
foreach(ScheduledTask t in ItemsFound) {
t.DoTask();
MyScheduledTasks.Remove(t);
}
This seemingly harmless code is actually doing silly amounts of work. The remove() function is O(n) ! and doing an O(n) operation with a data size <> 200,000 gets QUITE expensive, especially when you are doing it 10000 times on an iteration.
What I would have liked to do here is something similar to the following.
foreach(ScheduledTask st in MyScheduledTasks) {
if(st.TimeToRun > DateTime.Now) {
break;
} else {
ItemsFound.Add(st);
location++;
}
}
then just say something similar to list.Root = list[location]; this would remove the first N items from the list with almost no overhead ...
unfortunately I could not figure out how to do anything similar to this with the arraylist class and instead put in an immediate solution of creating a new arraylist on each iteration with the items that should stay instead of removing the items that need to be deleted (note this is just a temporary solution). The reason that this works is that the add is O(1).
I fear that I will have to create my own data structure implementation in order to handle this case; oh well I guess it has been a while since I did a linked list implementation :-/
This said, I think that the ability to trim a list from the beginning would be extremely useful functionality in an arraylist .. even if I had to specify it as .. SetRoot(index) which in a true linked list would force a traversal to the indexth node in order to set it as the root but a single O(n) operation is still vastly superior to n O(n) operations
Greg