Monday, September 22, 2008

A Simple Task Queue

I’ve spent the last 4 posts talking about Regular Expressions and some difficult patterns. But, this is a C# blog, so I really want to be talking about C#. Today, I hope to provide you with a nice little start on a multi-threaded "Task Queue" application. A Task Queue places task requests in a queue to be serviced asynchronously, in the order received.

Rather than keep you in suspense, here’s the code up front. If you'd like an explanation, I've attempted that below. (Update 10/6/2008: Sorry, folks, about the bug below. The Enqueue method must set the Busy field to true when queuing the first task in order to avoid the thread race. It's now been fixed.)


using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Threading;

namespace TaskQueuePOC
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        int counter = 0;
        private void button1_Click(object sender, EventArgs e)
        {
            // Each click queues one more task carrying its own UserData.
            new MyTaskQueue(counter++).Enqueue();
        }
    }

    public class MyTaskQueue : TaskQueue
    {
        public MyTaskQueue(object UserData)
            : base(UserData)
        {
        }
        protected override void Task()
        {
            Thread.Sleep(2000); // simulate two seconds of work
            Console.WriteLine(UserData.ToString());
        }
    }

    public abstract class TaskQueue
    {
        public object UserData { get; private set; }
        public TaskDelegate TaskDlgt { get; private set; }
        protected abstract void Task();

        public void Enqueue()
        {
            TaskDlgt = new TaskDelegate(Task);
            lock(lockObject)
            {
                if(Busy)
                    _q.Enqueue(TaskDlgt);
                else {
                    // Set Busy inside the lock so a second caller cannot also start a task.
                    Busy = true;
                    TaskDlgt.BeginInvoke(new AsyncCallback(this.TaskCallback), TaskDlgt);
                }
            }
        }

        // Shared by all tasks: the pending delegates, the running flag,
        // and the lock object that guards both.
        private static Queue<TaskDelegate> _q = new Queue<TaskDelegate>();
        private static bool Busy = false;
        private static object lockObject = new object();

        public TaskQueue(object Data)
        {
            UserData = Data;
        }

        public delegate void TaskDelegate();

        private void TaskCallback(IAsyncResult ar)
        {
            // The state object is the delegate that just finished. It may belong to a
            // different TaskQueue instance than this one, so always end that delegate.
            TaskDelegate dlgt = (TaskDelegate)ar.AsyncState;
            dlgt.EndInvoke(ar);
            NextTask();
        }
        private void NextTask()
        {
            TaskDelegate dlgt;
            lock(lockObject)
            {
                if(_q.Count > 0)
                {
                    dlgt = _q.Dequeue();
                    dlgt.BeginInvoke(TaskCallback, dlgt);
                }
                else
                    Busy = false;
            }
        }
    }
}


I've titled the article "A Simple Task Queue", but simple is a little misleading. That's because it's difficult for the beginner to tell by looking where the threads are or even how it works. The key to understanding this implementation is to understand the "delegate BeginInvoke" call. I've covered that in other articles on threading. In particular, I refer you to "Threading with .NET ThreadPool Part 4" for a deeper discussion.

Overall, the workhorse of this application is the abstract class TaskQueue. The class implements all that's needed to queue tasks and execute them in order. All the user need supply is a derived class that overrides "Task()" and a constructor that passes in some UserData. The application instantiates a new derived TaskQueue object with the data needed for the task and then calls the Enqueue() method.

The TaskQueue works by creating 3 private static control members. There is the queue which holds delegates to run. There's also a Busy member to say when there is an active thread running. There is also a lockObject used internally to synchronize thread access to the queue and the Busy indicator. This is needed because as one thread completes and tries to update the Busy indicator or take a new delegate off of the queue for execution, the "producer" thread (our UI in this case) may be trying to enqueue another delegate. Since two threads could be accessing these variables simultaneously, we synchronize them with a lock statement (a Monitor) on lockObject.

The multi-threading comes into play when a delegate's BeginInvoke(...) method is called. This method will allocate a thread from the ThreadPool and execute the Task in that thread. BeginInvoke is provided a TaskCallback() function and some state information. In this case, the state information is a reference to the delegate.

The Callback function is responsible for checking the queue and launching the next Task or setting Busy to false.
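For readers on newer runtimes, where delegate BeginInvoke is not supported (.NET Core and later throw PlatformNotSupportedException), the same three pieces (queue, Busy flag, lock) can be sketched on top of ThreadPool.QueueUserWorkItem. This is only a rough sketch under that assumption, not a replacement for the code above. SerialQueue, Pending, Gate, Run and Next are hypothetical names I made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: SerialQueue and its members are illustrative names,
// not part of the TaskQueue class in the article.
public static class SerialQueue
{
    private static readonly Queue<Action> Pending = new Queue<Action>();
    private static readonly object Gate = new object();
    private static bool busy;

    public static void Enqueue(Action task)
    {
        lock (Gate)
        {
            if (busy)
            {
                Pending.Enqueue(task);      // something is running; wait in line
                return;
            }
            busy = true;                    // claim the queue while still holding the lock
        }
        ThreadPool.QueueUserWorkItem(_ => Run(task));
    }

    private static void Run(Action task)
    {
        try
        {
            task();
        }
        catch (Exception ex)
        {
            // A failing task is reported and skipped so the queue keeps moving.
            Console.Error.WriteLine("Task failed: {0}", ex.Message);
        }
        Next();
    }

    private static void Next()
    {
        Action next;
        lock (Gate)
        {
            if (Pending.Count == 0)
            {
                busy = false;               // queue drained; the next Enqueue starts a task
                return;
            }
            next = Pending.Dequeue();
        }
        ThreadPool.QueueUserWorkItem(_ => Run(next));
    }
}

public static class Program
{
    public static void Main()
    {
        using (var done = new ManualResetEvent(false))
        {
            for (int i = 0; i < 3; i++)
            {
                int n = i;                  // per-task copy, playing the role of UserData
                SerialQueue.Enqueue(() =>
                {
                    Thread.Sleep(200);
                    Console.WriteLine(n);
                    if (n == 2) done.Set();
                });
            }
            done.WaitOne();                 // keep the process alive until the last task ends
        }
    }
}
```

As in the original, the completion of one task is what starts the next one, so at most one task runs at a time and tasks run in the order they were queued.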

The magic in all of this is that the delegate is a reference to a specific instance of a TaskQueue object's Task() method. That way, the delegate's Task() method has access to local information about that specific task. So, each delegate will operate with its own version of UserData. Notice, this is also the reason that NextTask() dequeues a delegate and places the reference in a local variable rather than the TaskDlgt member. TaskDlgt is a reference to one's self, while the delegate taken from the queue is a different delegate. So, each task's completion callback is responsible for starting the next task (if any).

Also, notice that the Tasks are highly encapsulated. Once the task is on the queue, all that is available is the delegate. The rest of the task is somewhat hidden, though reflection can be used to make tweaks to the tasks and task data if necessary.

There are many improvements that can be made. Most notably, one could add a way to stop the queue and cancel remaining tasks. One could add controls to their derived class and overridden Task() member to allow individual task control or gross control over the entire queue.

There you have it, a "simple" Task Queue for your amusement and edification. Hopefully, you find it as useful as I have. Also, I have written several other articles on threads and threading. Please feel free to poke around my archives; you may find these articles useful as well.

10/6/2008: Recently, a kind reader pointed out a bug in my code (since fixed) where "Busy" was not being set to "true" anywhere. The obvious behavior was that all tasks ran immediately; no queuing was done at all.

Saturday, September 20, 2008

Regular Expression Double Negatives

In my last article, titled "Regular Expression Alternations", we discussed Alternation as one means of performing the AND operation in a regular expression pattern.

(?(expression)yes|no)

But, what if you don't have this pattern in your version of RegEx? Some of you may be reading to gain knowledge that you can apply in other programming languages besides C# or the .NET environment. The pattern I described in the last article does not work in every regular expression dialect. In particular, it does not work in JScript Regular Expression syntax. And, when you use tools like ASP.NET’s RegularExpressionValidator, you are implicitly using JScript on the client. So then, what do you do?

To restate the original requirement, we want to validate a string that has at least one digit, at least one uppercase letter, and at least 6 characters. The problem statement is clearly an AND problem, so it begs to be written with the Alternation pattern. But in situations where we can’t use the Alternation, we can opt for my second approach. We can use the Negative Look-Ahead pattern.

(?!expression)

The negative look-ahead pattern is “non-consuming” and matches strings (or parts of strings) that are not followed by the “expression”. You may wish to brush up on negative look-ahead with my article “Regular Expressions in C# - Negative Look-Ahead”. With negative look-ahead and using the OR pattern, we can use double-negative logic to express the original problem. This approach is based upon the logic rule known as De Morgan's law…

( A AND B ) == NOT ( (NOT A) OR (NOT B) )

Using this rule, the problem can be stated differently. We can state our problem in terms of ORs instead of ANDs. In other words...


If it is not the case that our string is devoid of digits or devoid of uppercase letters or short of 6 characters, then it is a valid string.


Notice I chose terms that suggest negative tests. Take a moment and think about it.

IF ( NOT ( NOT(Has Digit) OR NOT(Has Uppercase) OR NOT(6 or more chars) ) )

This has the same logic result as…

IF ( (Has Digit) AND (Has Uppercase) AND (6 or more chars) )

If you can’t get your mind around this basic programming concept, then re-writing your pattern requirement in terms of ORs will be difficult. So, experiment on paper and with generic logic variables to prove to yourself that your re-written pattern logic means the same thing as the original.

Now let’s look at the actual Regular Expression pattern. The component patterns from which we build look like this...

(^[^\d]*$) string is devoid of digits
(^[^A-Z]*$) string is devoid of uppercase letters
(^.{0,5}$) string has 5 or fewer characters

String them together to get the inner test ...

(^[^\d]*$)|(^[^A-Z]*$)|(^.{0,5}$)

Sometimes you can pause here and simply test for your string to match. If you supply an invalid string, it should MATCH. If you provide a valid string, it should NOT match. So, at this point, we have the opposite of what we really want. (A validator may not allow you to test this way; you would have to write a C# program to use as a “test jig”.) So, we now negate the inner pattern with the negative look-ahead syntax...

(?!(^[^\d]*$)|(^[^A-Z]*$)|(^.{0,5}$))

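This is close, but it is not anchored. Regex is free to try the look-ahead at every position in the string, and at any position past the start the inner ^ can never match, so the negative look-ahead succeeds there and nearly every string "matches". We have to pin the look-ahead to the start of the string and, for the Match() approach, add a pattern that consumes the whole string...

^(?!(^[^\d]*$)|(^[^A-Z]*$)|(^.{0,5}$)).*$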

This pattern is much harder to decipher than the AND pattern using Alternation. But now, armed with the knowledge from these last two articles, you should be able to discern the meaning of such patterns. We should be able to plug this pattern into all of the algorithms we covered in the previous articles and have it validate strings according to the requirement.
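Here is a small test jig you can use to convince yourself. It is only a sketch: it assumes the look-ahead is placed directly after the leading ^ (the same placement used in the negative look-ahead article), and the DoubleNegativeJig class and IsValid method are names I invented for the illustration. The test it applies is the strict "the match is the whole string" test.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical test jig; the class and method names are illustrative only.
public static class DoubleNegativeJig
{
    // Assumption: the negative look-ahead sits right after the leading ^,
    // so all three inner tests are evaluated from the start of the string.
    public const string Pattern = @"^(?!(^[^\d]*$)|(^[^A-Z]*$)|(^.{0,5}$)).*$";

    // Validator-style test: there is a match and the match is the whole string.
    public static bool IsValid(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success && m.Value == input;
    }

    public static void Main()
    {
        string[] tests = {
            "Abc123",   // digit, uppercase, 6 characters
            "abc123",   // no uppercase letter
            "ABCDEF",   // no digit
            "Ab1",      // too short
            "",         // empty
        };
        foreach (string test in tests)
            Console.WriteLine("{0}\t({1})", IsValid(test), test);
    }
}
```

Only the first test string should report True. Each of the others trips one of the three "devoid of" patterns inside the look-ahead.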

That’s all for now, I hope you enjoyed the discussion and find these examples useful.

Sunday, September 14, 2008

Regular Expression Alternations

I promised in my last article that I'd have more to say about regular expressions. So here you have it. Today, we look at another tough problem for Regular Expressions. Let’s consider writing a pattern to validate that a string contains at least one digit, at least one uppercase letter, and at least 6 characters. One would think that this would be easy. But here we’re faced with an AND situation and the regular expression syntax doesn't provide an AND operator. Take, for example, any validation problem that has the form…

A and B and C

Without an AND operation in Regex, you are almost forced to go outside of the pattern and implement the test in multiple patterns and multiple passes of your validator. That may be the best approach, or it may be impossible if you are working with a blackbox validator and must provide a single Regex pattern.

I've found two approaches that can be applied to solve such problems. We'll discuss one of those approaches today. The approach is to use the “Alternation” pattern. MSDN documentation lists 3 alternation constructs: the simple OR (the vertical bar), the "expression" alternation, and the "name" alternation. Our problem requires the "expression" version.

(?(expression)yes|no)

This pattern is used like an IF-THEN-ELSE programming pattern. In the IF-THEN-ELSE pattern, the THEN portion can function like an AND. We could rewrite our test above as…

IF A THEN IF B THEN IF C THEN MATCH

But the alternation pattern isn’t as flexible as a programming language like C# or VB.Net. We can’t easily drop the ELSE as we did above. So, our pseudo code above would have to take the form…

IF A THEN
    IF B THEN
        IF C THEN
            MATCH
        ELSE
            FAIL MATCH
    ELSE
        FAIL MATCH
ELSE
    FAIL MATCH

The pseudo regex pattern using our original requirements looks something like this...

(?(does string have an uppercase letter)
    (?(does string have a digit)
        (?(does string have at least 6 characters)
            (match the whole string)
          | (fail the match))
      | (fail the match))
  | (fail the match))

This should look very similar to the IF-THEN-ELSE statement above. All we have left now is to write the individual patterns. We need a pattern for each test, a pattern to match the whole string and a pattern that will never match any string.

(.*[A-Z].*) matches if there's at least one upper case letter in the string
(.*[0-9].*) matches if there's at least one digit in the string
(.{6,}) matches if there are 6 or more characters in the string
(.*) matches the entire string
([^\W\w]) won't match anything

Notice how each test is written in such a way that the whole string is selected. That way all tests are operating on the exact same input. Now piece it together and it looks like this...

(?(.*[A-Z].*)(?(.*[0-9].*)(?(.{6,})(.*)|([^\W\w]))|([^\W\w]))|([^\W\w]))

It's not very pretty, but it's conceptually straight forward. It follows the pseudo-regex precisely. See if you can format it nicely like the pseudo-pattern above.
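If you'd rather not count parentheses by eye, one option is to let C# do the formatting. The sketch below builds the very same pattern string by concatenation, one line per branch, so the layout mirrors the pseudo-pattern. The AlternationPattern class and the Never constant are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; it only assembles the pattern shown above piece by piece.
public static class AlternationPattern
{
    private const string Never = @"([^\W\w])";   // the "fail the match" branch

    public static readonly string Pattern =
        @"(?(.*[A-Z].*)" +                 // IF the string has an uppercase letter THEN
            @"(?(.*[0-9].*)" +             //   IF the string has a digit THEN
                @"(?(.{6,})" +             //     IF the string has 6 or more characters THEN
                    @"(.*)" +              //       match the whole string
                "|" + Never + ")" +        //     ELSE fail the match
            "|" + Never + ")" +            //   ELSE fail the match
        "|" + Never + ")";                 // ELSE fail the match

    public static void Main()
    {
        Console.WriteLine(Pattern);
        string[] tests = { "Abc123", "abc123", "ABCDEF", "Ab1" };
        foreach (string test in tests)
            Console.WriteLine("{0}\t({1})", Regex.IsMatch(test, Pattern), test);
    }
}
```

The first line printed should be character for character the one-line pattern above, and only "Abc123" should report True.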

The final step is to verify our pattern with some code. We’ll use C# for that. Just create a new Windows Forms Application project in C#. At the top of the form's code file, add a using directive for the Regular Expression namespace.

using System.Text.RegularExpressions;

Next drop a Label and a Textbox on the form. Clear the text in label1. Then double click the textbox and insert into the textBox1_TextChanged handler the following code…

string pattern = @"(?(.*[A-Z].*)(?(.*[0-9].*)(?(.{6,})(.*)|([^\W\w]))|([^\W\w]))|([^\W\w]))";
if(Regex.IsMatch(textBox1.Text,pattern))
{
    label1.Text = "Valid";
    label1.ForeColor = Color.Black;
}
else
{
    label1.Text = "Invalid Entry";
    label1.ForeColor = Color.Red;
}


The program will display “Valid” or “Invalid Entry” depending upon the input in the text box. This is a fairly simple password validator, but now you have the tools to expand upon it if you like. There is one caveat though. This pattern will not work in the ASP.Net RegularExpressionValidator component. In my next article on Regular Expressions, I will explain why. I will also explain my second approach mentioned above which will work in the RegularExpressionValidator situation.

Friday, September 12, 2008

Regular Expressions in C# - Negative Look-ahead

In my last article on Regular Expressions, we looked at a couple of simple expressions and 3 algorithms to use the expressions for “validating” strings. The main point of the article is that Regular Expression behavior will be confusing if not considered in the context of the algorithm being used. Please take a look at Regular Expressions. What’s that got to do with C#? to get the background for this article.

In this article, we will discuss one of the more difficult Regular Expression problems. How do you use “Negative Look-ahead”? To use Regular Expression “Negative Look-Ahead” or “Negative Look-Behind”, you have to change the way you think about pattern matching. First, the negative look-ahead takes the syntax…

“(?!pattern)”

In the words of the MSDN documentation…

(Zero-width negative lookahead assertion.) Continues match only if the subexpression does not match at this position on the right. For example, \b(?!un)\w+\b matches words that do not begin with un.


This expression will “find” parts of the supplied string that are not followed by the “pattern”. One might erroneously think that this pattern will refuse to match strings that have the “pattern” in them. It does not. In fact, the following example actually “finds” some strange results.

string pattern = @"(?!invalid)"; //negative lookahead
string test = "invalid";
Regex rx = new Regex(pattern);
Console.WriteLine("Match:\t\t\t{0}\t({1})", rx.Match(test).Success.ToString(), test);
Match mx = rx.Match(test);
while (mx.Success)
{
    foreach (Group g in mx.Groups)
    {
        Console.WriteLine("\t\t\t\t({0}) ({1}): {2}", mx.Value, g.Value, g.Index);
    }
    mx = mx.NextMatch();
}

The result is that there are actually 7 matches in the test string. The confusing matter is that each of the matches is an empty string, and the index for each match increments, from 1 to the number of characters in the test string. To understand this behavior, let’s think about the pattern differently. Let’s break it apart into some simple components. You could consider the pattern to be equivalent to…

“” + “(?!invalid)” + “”

This is a pattern to match an empty string, followed by a pattern to reject the string “invalid”, followed by a pattern to match an empty string. In short, without the negative look-ahead, this is a pattern to match an empty string. If you were to use just the simple empty string pattern on the test string, there would be 8 matches with indices for each match incrementing from 0 to the number of characters in the test string. There are 7 matches with the negative look-ahead and 8 matches without. The indices start at 1 with the negative look-ahead, and they start at 0 without. So, the negative look-ahead is causing one of the potential matches to fail. It’s rejecting the empty string match that occurs just prior to the first character in the test string. I.e. it is rejecting the only empty string match that is followed by “invalid” and matching all the others.
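If you want to check those counts yourself, a few lines will do it. This is just a quick sketch; EmptyMatchCount and MatchIndexes are names I made up for the purpose.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for counting where a pattern matches.
public static class EmptyMatchCount
{
    // Returns the starting index of every match of pattern in input.
    public static int[] MatchIndexes(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        int[] indexes = new int[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            indexes[i] = matches[i].Index;
        return indexes;
    }

    public static void Main()
    {
        string test = "invalid";

        int[] plain = MatchIndexes(test, "");
        Console.WriteLine("empty pattern: {0} matches, first index {1}", plain.Length, plain[0]);
        // empty pattern: 8 matches, first index 0

        int[] negative = MatchIndexes(test, "(?!invalid)");
        Console.WriteLine("(?!invalid): {0} matches, first index {1}", negative.Length, negative[0]);
        // (?!invalid): 7 matches, first index 1
    }
}
```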

Oh my head is spinning! Why did I eliminate the negative look-ahead above? I did so to understand its effect on the entire pattern matching process. And I learned that the negative is actually finding a pattern to eliminate. In other words, it’s doing its job. It is eliminating from the set of matches the one match that is followed by the string “invalid”. But, because there are many other “empty-string” matches, IsMatch() returns “true” and Match() returns 7 matches. I also eliminated the negative look-ahead because “look-around” patterns do not “consume” characters. They will never show up in the match result, so by temporarily removing the non-consuming pattern, I can see the real pattern that the look-around pattern constrains.

The Regular Expression philosophy is to find things, not to eliminate things. Since Regex works so hard at finding matches, it becomes difficult to write patterns whose job is to exclude strings. One could just write the pattern to find the strings that are unwanted and negate the match in program logic. Negative matching syntax is available, but Regex treats such syntax as a means of reducing the number of matches while trying to find ANYTHING that would otherwise match.

So how do I make this work? Consider the requirement to match only those strings that do not have the sequence “invalid” in them. First, you must define a pattern to match any otherwise valid string. You must define the pattern in such a way that it will match the string using the most restrictive form of the 3 programming approaches described in my last article. In particular, your pattern must pass the following test…

Match mx = Regex.Match(test, pattern);
if(mx.Success && (mx.Groups[0].Value == test))

The test above makes sure that not only does the test string “have” a match but the test string “is” the match. The empty string pattern will find many matches, but the whole match will not equal the test string for any test string other than an empty string. You have to watch out for the asterisk (*) and plus (+) which will consume as much as they can, or as little as necessary to achieve a match. Such patterns will work in the logic above, but they can change behavior as you start adding look-around patterns. In the case of our requirement, the all encompassing pattern would be “^.*”. Leave off the “$” because we will be using look-ahead. When looking ahead, we do not want to anchor the end of the string unless absolutely necessary.

Next, define a pattern to find whole strings that you’d like to exclude. “.*invalid.*” works. This pattern matches any string containing the sequence “invalid”. Next, wrap this pattern with the negative look-ahead syntax “(?!.*invalid.*)”. And finally, insert it into the first pattern after the first anchor (^ in our case). Our resulting pattern would be “^(?!.*invalid.*).*”. Use the following test jig to prove this pattern to yourself.

string pattern = @"^(?!.*invalid.*).*";//negative look-ahead
string[] tests = {
    "invalid",
    "",
    "this is also invalid",
    "but this is okay",
    "but this invalid string is not",
};

Regex rx = new Regex(pattern);
foreach (string test in tests)
{
    Console.WriteLine("Match:\t\t\t{0}\t({1})", rx.Match(test).Success.ToString(), test);
    Match mx = rx.Match(test);
    while (mx.Success)
    {
        foreach (Group g in mx.Groups)
        {
            Console.WriteLine("\t\t\t\t({0}) ({1}): {2}", mx.Value, g.Value, g.Index);
        }
        mx = mx.NextMatch();
    }
}

I’ll stop here for today and let you digest what’s going on. Certainly, the pattern can be optimized and some characters reduced. Experiment with changes and observe the effects. I am not through with the subject of Regular Expressions and Negative type matching. Check back later as I hope to post on the subject again soon.

Friday, September 5, 2008

Regular Expressions? What's That Got To Do With C#?

...Only that I often need to know Regular Expressions for my C# work. However, the online help and resources seem to come up a little short. So, today I diverge a little and discuss this cryptic yet valuable ancillary topic to try to help you through your next Regex dilemma.

I'm not going to waste time and internet bandwidth explaining what a Regular Expression is, there are plenty of sites for those. But, I will give special thanks here to OmegaMan who compiled and posted the following on the MSDN Regex Forum...

OmegaMan's .Net Regex Resources Reference

I refer to it often.

So let's dive right in. Whenever you are considering using regular expressions, you need to determine what kind of pattern matching problem you are trying to solve.

1) Do I want a regular expression to check a string for validity?
2) Do I want a regular expression to find certain things in my string?
3) Do I want a regular expression so that I can replace patterns in my string?

There can be some overlap in #1 and #2 since validity may depend upon the string containing "certain things". Certainly, if you want to replace something in #3, you need to find it in the string with #2. But, many of the problems people run into with regular expressions can be traced to not having identified the problem properly.

Take the problem, "make sure my string contains only letters and digits". Sounds simple enough, and one may write the pattern ...


"[\da-zA-Z]*"
\d = digit
a-z = lowercase letter
A-Z = uppercase letter
[]* = zero or more of the characters in the brackets


This pattern might "find" letters and digits in a string, but it doesn't say that there are ONLY letters and digits in that string. Regex.IsMatch(...) tests a string with a pattern and returns true if the string contains a match. Regex.Match(...) tests a string with a pattern and returns a Match object indicating if the string contains matches and details, if any, about each match.

So if you are doing a validity check with the pattern above, your results will depend on which tools you use and how you use them. Given that pattern, both IsMatch() and Match() will find matches in a string, even if it contains undesirable characters. In fact, because of the asterisk (*), the string doesn't have to contain any of the pattern characters for there to be "a match" (it matches the empty string). These functions, given this pattern, are simply indicating whether or not any part of the string matches the pattern. Here's some code to demonstrate...

Example 1:

string pattern = @"[\da-zA-Z]*"; //use the @ to tell c# to leave \ alone
string[] tests = {
    "containsOnlyLettersAnd01234",
    "contains letters And 01234, but also spaces",
    "!@#$%", // contains none of the desired characters
    "", // a completely empty string
};
foreach(string test1 in tests)
    Console.WriteLine("IsMatch:\t{0}\t({1})",Regex.IsMatch(test1,pattern).ToString(),test1);
foreach(string test2 in tests)
    Console.WriteLine("Match:\t{0}\t({1})",Regex.Match(test2,pattern).Success.ToString(), test2);

...outputting...

IsMatch: True (containsOnlyLettersAnd01234)
IsMatch: True (contains letters And 01234, but also spaces)
IsMatch: True (!@#$%)
IsMatch: True ()
Match: True (containsOnlyLettersAnd01234)
Match: True (contains letters And 01234, but also spaces)
Match: True (!@#$%)
Match: True ()


Obviously, several of these strings are not valid by our requirements. So, what went wrong, and how would you "validate" the string? In order to do the desired validity check, one must consider both the pattern and the Regex method that will be used. For instance, the pattern as written will validate if you write the supporting code to accommodate it. For example...

Example 2:

Regex rx = new Regex(pattern);
foreach(string test1 in tests)
    Console.WriteLine("Modified Match:\t{0}\t({1})",
        (rx.Match(test1).Success && (rx.Match(test1).Value == test1)).ToString(),
        test1);

...outputting...

Modified Match: True (containsOnlyLettersAnd01234)
Modified Match: False (contains letters And 01234, but also spaces)
Modified Match: False (!@#$%)
Modified Match: True ()


Now that looks much better. Our pattern match now "works" except for the empty string. The requirement might be considered vague and allow for such a match. Many applications will have an input text box that starts out blank. When the user enters characters, the text is then validated. These apps usually explicitly test for empty text. Case in point, the ASP.NET RegularExpressionValidator documentation states that it will not validate the empty string, i.e., empty strings will PASS. It is up to the programmer to require some input. By the way, RegularExpressionValidator does pattern matching validation on both the client and the server. On the client, it uses JScript Regular Expression syntax, which has a smaller feature set than the syntax the server uses. It also uses the same program construction as the second example.


if(match != null && (match[0] == value)) // valid


If you use other tools, you must know how the validity test is done. For, in the last example, patterns that have look-ahead or look-behind will often fail. They will look perfectly valid, and they will match using IsMatch(), but they require that you add some additional pattern to consume those characters that look-around does not consume. We'll get into that in a future article.
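To make the point concrete, here is a small sketch of that validator-style test written as a C# helper. The WholeMatch class, the IsWholeMatch method and the sample strings are all my own illustrative choices. It shows a look-ahead-only pattern that passes IsMatch() but fails the stricter test until a consuming pattern is added.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper mirroring the "match equals the whole value" test.
public static class WholeMatch
{
    // True only when the first match found is the entire input string.
    public static bool IsWholeMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success && m.Value == input;
    }

    public static void Main()
    {
        string test = "okay";
        string lookAheadOnly = @"(?!.*invalid)";       // consumes nothing
        string withConsumer = @"^(?!.*invalid).*";     // look-ahead plus a consuming .*

        Console.WriteLine("IsMatch, look-ahead only:\t{0}", Regex.IsMatch(test, lookAheadOnly));
        Console.WriteLine("Whole match, look-ahead only:\t{0}", IsWholeMatch(test, lookAheadOnly));
        Console.WriteLine("Whole match, with consumer:\t{0}", IsWholeMatch(test, withConsumer));
    }
}
```

The first and third lines should report True and the second False: the look-ahead alone matches an empty string at the start of "okay", which is a match but not the whole value.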

Now the pattern could have been written differently. Using the next pattern example, you can use any of the three methods to validate that the string contains only letters and digits. This time, I also require that the input string not be blank.

"^[\da-zA-Z]+$"
^ = match the beginning of string/line, zero width pattern
$ = match the end of the string/line, zero width pattern
[]+ = 1 or more of the characters in the brackets

Example 3:

string pattern2 = @"^[\da-zA-Z]+$";
Regex rx = new Regex(pattern2);
foreach (string test in tests)
{
    Console.WriteLine("Modified Match:\t{0}\t({1})",
        (rx.Match(test).Success && (rx.Match(test).Value == test)).ToString(),
        test);
    Console.WriteLine("IsMatch:\t\t{0}\t({1})", rx.IsMatch(test).ToString(), test);
    Console.WriteLine("Match:\t\t\t{0}\t({1})", rx.Match(test).Success.ToString(), test);
}

...outputting...

Modified Match: True (containsOnlyLettersAnd01234)
IsMatch: True (containsOnlyLettersAnd01234)
Match: True (containsOnlyLettersAnd01234)
Modified Match: False (contains letters And 01234, but also spaces)
IsMatch: False (contains letters And 01234, but also spaces)
Match: False (contains letters And 01234, but also spaces)
Modified Match: False (!@#$%)
IsMatch: False (!@#$%)
Match: False (!@#$%)
Modified Match: False ()
IsMatch: False ()
Match: False ()


In other words, the pattern and the approach must work together. When you have control of both, then solving the problem is easier. But as you can see in the last example, all 3 approaches can validate according to our requirements by tweaking the pattern. The trick is often to find the tweak that works in all cases. When you can't change the underlying programming, like in the case of RegularExpressionValidator, you have to be able to write your pattern to "match" the underlying approach.

Now these are simple examples and I would have liked to get into some real meaty Regex expressions, but I've run out of time, and this posting is late. I'll be back though with more on Regular Expressions in the next article. For now, I hope this is of some use to you.

Monday, September 1, 2008

Enumerations and Strings

It has been a while since my last post. Things get crazy at times. But, here is another bit of sample code for you to muse over. We will look at enumerations and strings.

In C#, an enumeration is NOT a string, nor can you define it to be one. An enumeration can be defined as any of a number of integer types. You can leave it as its default type...


public enum unspecifiedTypeEnum
{
    one = 1, two, three,
}


In which case you get named 32 bit values. Or you can specify the type of the underlying value as byte, sbyte, ushort, short, uint, int, ulong or long...


public enum byteTypeEnum : byte
{
    one = 1, two, three,
}
// or...
public enum ushortTypeEnum : ushort
{
    one = 1, two, three,
}
// etc...


... But, you can't declare it as a string.

That doesn't mean you can't use strings at all. When I want to save data to a file for future reference, I like to make my enumerations human readable and store them in the file as human-readable. It would be a shame to use enumerations to write clear code only to store them as very cryptic values in my maintenance and configuration files. Getting the string representation given the enumeration value is easy. We just use the overridden object.ToString() method. For enumerations, this returns the name of the value as a string...


public enum Animals { cat, mouse, bird, dog, }
//....

Animals myAnimal = Animals.cat;
Console.WriteLine("CurrentAnimal={0}",myAnimal.ToString());


When I load a configuration value back in to my program, I would want to work with the value as the original enumeration type. That's a little trickier, but with the help of "reflection", it can be done. This little routine shows how...


using System.Reflection;
// ...

Animals FromString(string animal)
{
    Type t = typeof(Animals);
    FieldInfo[] fi = t.GetFields();

    try
    {
        foreach (FieldInfo f in fi)
        {
            if (f.Name.Equals(animal))
            {
                // Throws for a hidden, non-constant field such as value__.
                return (Animals)f.GetRawConstantValue();
            }
        }
    }
    catch
    {
        // Fall through to the "Not an Animal" exception below.
    }
    throw new Exception("Not an Animal");
}

Actually, it's not so tricky. A reader (thank you, Paul) pointed me to the "static" Enum functions which do much of the hard work for you. The function above can be more simply written as...

Animals FromString(string animal)
{
    try
    {
        return (Animals)Enum.Parse(typeof(Animals), animal);
    }
    catch (ArgumentException ex)
    {
        throw new Exception("(" + animal + ") is not in the Animals enumeration.", ex);
    }
}


This gives me an enumeration value for my string, assuming the string matches one of the enumeration names. Enhancements could be made to this code to store and retrieve the fully qualified name. I'll leave that as an exercise for the reader.

You may be wondering about the try and catch in the reflection version above. GetRawConstantValue() returns the field's constant value, which we cast to our enumeration type. Understand that there are other entries in the FieldInfo array besides just the enumeration names. Enumerations have a hidden field (value__) that is accessible through reflection. Under contrived circumstances, the caller may pass in the name of this hidden field. The "animal" name will match a field in the FieldInfo array, but because that field is not a constant, GetRawConstantValue() will throw an exception, as we want it to.

Now, with the ability to convert between string and enumeration, you can write your code so that everything internal is performed on the enumeration type, while persistence and other external representations can be strings. But, what if we want a list of all possible values? We might let the user choose from a list. How might we get that? Here's a way to create a string list for the enumeration...


// also requires
using System.Reflection;
// ...

IEnumerable<string> AnimalsList()
{
    Type t = typeof(Animals);
    foreach (FieldInfo f in t.GetFields())
    {
        try
        {
            // Only the enumeration names are constants. This call throws
            // for the hidden value__ field, which we then skip.
            f.GetRawConstantValue();
        }
        catch
        {
            continue;
        }
        yield return f.Name;
    }
}

...or simply ...

IEnumerable<string> AnimalsList()
{
    return Enum.GetNames(typeof(Animals)).ToList();
}


Now in your Form constructor you can write the following to show your enumerations as strings.


listBox1.Items.AddRange(AnimalsList().ToArray());


Someone will invariably want to associate a different string to their enumeration than the name. Maybe they want to obscure the meaning. More likely, the programmer wants to change the enumeration after there are already configuration files in production using older enumeration names. Changing the enumeration could break existing configuration files. But you have to deal with that regardless of how you store the data. I won't go into code examples, but a Dictionary object could be used internally. Another possibility is the string name and the enumeration name may differ by case only, but it should be a trivial modification to deal with that condition, so I leave that also as an exercise for the reader.
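For the curious, here is roughly what I have in mind for both ideas. Treat it as a sketch under my own assumptions: the AnimalNames class, the Aliases dictionary and the alias strings "kitty" and "puppy" are invented for the example. The case-insensitive part relies on the Enum.Parse overload that takes an ignoreCase flag.

```csharp
using System;
using System.Collections.Generic;

public enum Animals { cat, mouse, bird, dog }

// Hypothetical helper; names and alias strings are illustrative only.
public static class AnimalNames
{
    // Older or alternate names mapped to the current enumeration values.
    private static readonly Dictionary<string, Animals> Aliases =
        new Dictionary<string, Animals>(StringComparer.OrdinalIgnoreCase)
        {
            { "kitty", Animals.cat },
            { "puppy", Animals.dog },
        };

    public static Animals FromString(string name)
    {
        Animals alias;
        if (Aliases.TryGetValue(name, out alias))
            return alias;

        // The third argument tells Enum.Parse to ignore case.
        // It throws ArgumentException when the name is not in the enumeration.
        return (Animals)Enum.Parse(typeof(Animals), name, true);
    }

    public static void Main()
    {
        Console.WriteLine(FromString("BIRD"));   // parsed ignoring case
        Console.WriteLine(FromString("Kitty"));  // resolved through the alias table
    }
}
```

One thing to keep in mind with Enum.Parse is that it also accepts numeric strings, so a value like "7" parses without complaint even though no animal has that value. If that matters for your configuration files, Enum.IsDefined can be used to check the result.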

Finally, these enumeration functions are perfect for making Generic. They could be used on several enumerations in our code, and we wouldn't want to re-implement it each time. So, I will leave you with the following generic example of Enumerations and Strings.


using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Reflection;

namespace EnumExample
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
            listBox1.Items.AddRange(EnumList<Animals>().ToArray());
        }

        public enum Animals { dog = 1, cat, mouse, bird, }

        private void button1_Click(object sender, EventArgs e)
        {
            Animals myAnimal = Animals.bird;
            Console.WriteLine("CurrentAnimal={0}", myAnimal.ToString());

            myAnimal = FromString<Animals>("cat");
            try {
                // value__ is the hidden field, so this call is expected to throw.
                myAnimal = FromString<Animals>("value__");
            } catch (Exception ex) {
                Console.WriteLine("error: {0}", ex.Message);
            }
        }

        T FromString<T>(string animal) where T : struct
        {
            Type t = typeof(T);
            FieldInfo[] fi = t.GetFields();

            try
            {
                foreach (FieldInfo f in fi)
                {
                    if (f.Name.Equals(animal))
                    {
                        return (T)f.GetRawConstantValue();
                    }
                }
            }
            catch
            {
                // Fall through to the exception below.
            }
            throw new Exception("Not a Type " + typeof(T).Name);
        }

        IEnumerable<string> EnumList<T>()
        {
            Type t = typeof(T);
            foreach (FieldInfo f in t.GetFields())
            {
                try
                {
                    // Throws for the hidden value__ field, which we then skip.
                    f.GetRawConstantValue();
                }
                catch
                {
                    continue;
                }
                yield return f.Name;
            }
        }
    }
}
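The generic helpers can also be built on the static Enum functions mentioned earlier, which avoids the reflection loop entirely. The following is a sketch of that variation rather than a drop-in replacement. EnumStrings is a hypothetical class name, and the IsEnum check is my own addition because the struct constraint alone would also admit types like int.

```csharp
using System;
using System.Collections.Generic;

public enum Animals { dog = 1, cat, mouse, bird }

// Hypothetical generic helpers built on Enum.Parse and Enum.GetNames.
public static class EnumStrings
{
    public static T FromString<T>(string name) where T : struct
    {
        // "where T : struct" still allows non-enum value types, so check explicitly.
        if (!typeof(T).IsEnum)
            throw new ArgumentException(typeof(T).Name + " is not an enumeration.");
        try
        {
            return (T)Enum.Parse(typeof(T), name);
        }
        catch (ArgumentException ex)
        {
            throw new Exception("(" + name + ") is not in the " + typeof(T).Name + " enumeration.", ex);
        }
    }

    public static IEnumerable<string> EnumList<T>() where T : struct
    {
        // GetNames returns only the enumeration names, never the hidden value__ field.
        return Enum.GetNames(typeof(T));
    }

    public static void Main()
    {
        Console.WriteLine(FromString<Animals>("cat"));
        foreach (string name in EnumList<Animals>())
            Console.WriteLine(name);
        try
        {
            FromString<Animals>("value__");
        }
        catch (Exception ex)
        {
            Console.WriteLine("error: {0}", ex.Message);
        }
    }
}
```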