Threading with .NET ThreadPool
The System.Threading.ThreadPool is a great little feature for programmers wishing to add instant threading to their applications. You just enqueue callback functions; what could be simpler than that? I agree, so we'll now look at some simple coding experiments with the ThreadPool. By the way, here's the MSDN version of "How to: Use a Thread Pool," which makes for some good prerequisite reading. And though I risk repeating much of what may already be elsewhere on the web, I hope that I can add something of use in your estimation.
So first, let's get a basic understanding of thread pooling. Maybe you've already read my article, Thread Synchronized Queing; if so, you have a good start. A thread pool, like my SynchQueue class, involves some thread(s) writing to a queue and a set number of threads reading from that queue. If there isn't enough work to do, the reader threads wait, or block, on the queue. And if there is too much work to do, the queue grows indefinitely, or until you throttle your writer threads somehow.
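That reader/writer pattern can be sketched in a few lines. To be clear, this is only an illustration of the idea, not how the real ThreadPool is implemented; the TinyPool name and its worker loop are my own invention for this sketch.

using System;
using System.Collections.Generic;
using System.Threading;

public class TinyPool
{
    private readonly Queue<WaitCallback> _queue = new Queue<WaitCallback>();
    private readonly object _lock = new object();

    public TinyPool(int workers)
    {
        // Reader threads: each one loops forever, pulling work off the queue.
        for (int i = 0; i < workers; ++i)
            new Thread(WorkerLoop) { IsBackground = true }.Start();
    }

    // Writer side: enqueue a work item and wake one waiting worker.
    public void Enqueue(WaitCallback work)
    {
        lock (_lock)
        {
            _queue.Enqueue(work);
            Monitor.Pulse(_lock);
        }
    }

    private void WorkerLoop()
    {
        while (true)
        {
            WaitCallback work;
            lock (_lock)
            {
                // Block until there is work to do.
                while (_queue.Count == 0)
                    Monitor.Wait(_lock);
                work = _queue.Dequeue();
            }
            work(null);   // run the task outside the lock
        }
    }
}

Note that if writers outpace the workers, nothing here stops the queue from growing without bound; that is the throttling problem mentioned above.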
Let's start with a simple Windows Forms application. Later, I'll come back and discuss some of the caveats. I create a new application called ThreadPoolExample. To Form1 I add a couple of methods that I will use as callback functions. One is an instance method and the other a static class method. I also add a new class to Form1.cs called TestClass. TestClass also has a callback function defined.
private void CallBack(object Context)
{
    Console.WriteLine("Thread called");
}

private static void SCallBack(object Context)
{
    Console.WriteLine("Thread static callback");
}

...

public class TestClass
{
    public void CallBack(object context)
    {
        Console.WriteLine("TestClass #{0}", _mynum);
    }

    int _mynum;
    public TestClass(int n) { _mynum = n; }
}
Now, I add a button to the form and add the following click handler for the button.
private void button1_Click(object sender, EventArgs e)
{
    ThreadPool.QueueUserWorkItem(this.CallBack);
    ThreadPool.QueueUserWorkItem(new WaitCallback(this.CallBack));
    ThreadPool.QueueUserWorkItem(SCallBack);
    ThreadPool.QueueUserWorkItem((o) => { Console.WriteLine("lambda callback"); });
    ThreadPool.QueueUserWorkItem(delegate(object o) { Console.WriteLine("delegate callback"); });

    int max, dummy;
    ThreadPool.GetMaxThreads(out max, out dummy);
    for (int i = 0; i < max + 20; ++i)
        ThreadPool.QueueUserWorkItem(new TestClass(i).CallBack);
}
With this click handler, I am demonstrating several ways of starting work with a ThreadPool. In all cases, I start a thread working with the static ThreadPool method QueueUserWorkItem(). This method hands the task to the first available thread, so most of the calls start (or can start) immediately. I say "start a thread working" rather than "start a thread" because I am doing nothing to start threads; they are all started by the underlying framework. It is the framework that controls the maximum number of threads to start and the minimum number to keep on standby. I simply enqueue work to perform and let the framework manage the threads.
The first two calls show the ThreadPool being given an instance method. One call just provides the name of the method, while the other uses the familiar "new Delegate(function)" construction. I added both because I wanted to see whether there are any real differences between the two calling styles. I can't see any; in fact, the compiler creates a WaitCallback delegate in both cases, so the two forms should perform the same. By using the "new" operator, you do get the help of IntelliSense in figuring out how to write the callback signature. Other than that, I think it's purely a matter of programmer style. The third line provides a static class method as the callback.
The fourth line demonstrates passing a lambda expression to the ThreadPool. Lambda expressions are new to C# 3.0 and Visual Studio 2008; comment that line out if you are using 2005 or earlier. The fifth line demonstrates passing an anonymous method to the ThreadPool. Lastly, I use the static ThreadPool method GetMaxThreads() to find out how many threads the pool will use, and then I exceed that number by 20. The last set of calls also demonstrates passing instance methods of different object instances. This is handy because an instance method has direct access to the object's encapsulated data. In other words, you can implement a class to represent small units of work and hand these off to the ThreadPool by their callback method, thus maintaining separate state for each.
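As an aside, GetMaxThreads() reports the pool's worker-thread and I/O-completion-thread limits separately, and there is a matching GetMinThreads(). Here is a quick way to see what your process is configured with; the actual numbers vary by machine and framework version.

int maxWorkers, maxIo, minWorkers, minIo;
ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
ThreadPool.GetMinThreads(out minWorkers, out minIo);

// Values are per-process and version-dependent.
Console.WriteLine("Worker threads: min {0}, max {1}", minWorkers, maxWorkers);
Console.WriteLine("I/O threads:    min {0}, max {1}", minIo, maxIo);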
You might also note that ThreadPool.QueueUserWorkItem() is given methods belonging to different classes. The ThreadPool does not care about the type of the underlying data, only the signature of the callback you pass to it. This makes the tool quite versatile. You can get similar behavior from the SynchQueue generic class by declaring it to use a delegate type instead of a data type.
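Since the pool only cares about the callback signature, it's also worth knowing that QueueUserWorkItem() has an overload taking a state object, which arrives as the callback's object parameter. This is an alternative to wrapping state up in a class like TestClass; the snippet below is just an illustration.

// The second argument is handed back to the callback as its object parameter.
ThreadPool.QueueUserWorkItem(
    delegate(object state) { Console.WriteLine("Work item #{0}", state); },
    42);   // 42 is boxed and delivered as 'state'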
Now, run the code and click the button. You will see in Visual Studio's output window the various strings the threads are supposed to write. Everything works fine. Take a closer look at the output, though, and you will see that things mostly run in the order they were enqueued, but some of the tasks appear out of order. In fact, the callbacks are assigned a thread in the order they were enqueued, but there is no guarantee that the threads will finish in the same order the work was queued. This is fairly intuitive: tasks can take varying amounts of time to complete.
But also consider: though the work is dequeued by threads in FIFO order, it is entirely possible that some tasks will actually start out of order. Picture multiple tasks on the queue ready to go, and one thread grabbing a task and then immediately being preempted. Another thread grabs the next task and begins work before the first thread comes back.
Thus, we have our first caveat. That is, don't expect the ThreadPool to start your tasks in a fixed order. If you need them started in a precise, fixed order, you will need to synchronize your tasks in some other way.
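For example, one simple way to guarantee that task B starts only after task A has finished is to have B wait on an event that A signals. This is just a sketch with illustrative task names, and note the cost: B occupies a pool thread while it waits, so this approach doesn't scale to long dependency chains.

ManualResetEvent aDone = new ManualResetEvent(false);

ThreadPool.QueueUserWorkItem(delegate(object o)
{
    Console.WriteLine("task A");
    aDone.Set();        // signal that A has finished
});

ThreadPool.QueueUserWorkItem(delegate(object o)
{
    aDone.WaitOne();    // block this pool thread until A signals
    Console.WriteLine("task B");
});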
As I said earlier, getting multiple threads working in your application is almost trivial with the ThreadPool. But alas, multi-threading is never trivial. There are several aspects of thread pooling that should be addressed if you wish to have an industrial-strength application. These are...
1. Enqueuing work.
2. Detecting queue overload and throttling.
3. Detecting work start.
4. Reporting work progress.
5. Detecting work completion.
6. Stopping work-in-progress.
7. Cancelling un-started work.
We've only looked at enqueuing work. But without a way to know for sure that the work has been fully completed by the thread, any amount of interdependency is risky and a clean shutdown is nearly impossible. You may need to end your application or server while tasks are still running. To do this cleanly, you need a way to stop adding tasks to the queue, to cancel any un-started tasks on the ThreadPool queue, and to stop tasks currently being processed by a ThreadPool thread.
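To give a taste of what's involved, here is a rough sketch of that kind of cooperative shutdown: a shared stop flag that tasks (and the enqueuer) check, plus a counter of outstanding work so the shutdown call can wait for in-flight tasks to drain. The WorkTracker class is my own illustration, not a framework type, and it assumes callers stop enqueuing once Shutdown() has been called.

class WorkTracker
{
    public volatile bool StopRequested;     // long-running tasks should poll this
    private int _pending;                   // work items queued or running
    private readonly ManualResetEvent _allDone = new ManualResetEvent(true);

    public void Enqueue(WaitCallback work)
    {
        if (StopRequested) return;          // stop adding tasks to the queue
        if (Interlocked.Increment(ref _pending) == 1)
            _allDone.Reset();
        ThreadPool.QueueUserWorkItem(delegate(object state)
        {
            try
            {
                if (!StopRequested)         // cancel un-started work
                    work(state);
            }
            finally
            {
                if (Interlocked.Decrement(ref _pending) == 0)
                    _allDone.Set();
            }
        });
    }

    public void Shutdown()
    {
        StopRequested = true;
        _allDone.WaitOne();                 // wait for in-flight work to drain
    }
}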
I'll look at each of these in my next article. Until then, here is the full source for Form1.cs (you'll need to add your own button and hook up the button click handler yourself).
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Threading;

namespace ThreadPoolExample
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void CallBack(object Context)
        {
            Console.WriteLine("Thread called");
        }

        private static void SCallBack(object Context)
        {
            Console.WriteLine("Thread static callback");
        }

        private void button1_Click(object sender, EventArgs e)
        {
            ThreadPool.QueueUserWorkItem(this.CallBack);
            ThreadPool.QueueUserWorkItem(new WaitCallback(this.CallBack));
            ThreadPool.QueueUserWorkItem(SCallBack);
            ThreadPool.QueueUserWorkItem((o) => { Console.WriteLine("lambda callback"); });
            ThreadPool.QueueUserWorkItem(delegate(object o) { Console.WriteLine("delegate callback"); });

            int max, dummy;
            ThreadPool.GetMaxThreads(out max, out dummy);
            for (int i = 0; i < max + 20; ++i)
                ThreadPool.QueueUserWorkItem(new TestClass(i).CallBack);
        }
    }

    public class TestClass
    {
        public void CallBack(object context)
        {
            Console.WriteLine("TestClass #{0}", _mynum);
        }

        int _mynum;
        public TestClass(int n) { _mynum = n; }
    }
}