Using the Split method in C#

I’ve been playing with a little C# program to transfer marks into a spreadsheet. It is not a very clever program, and probably took me longer to write than it would have taken me to actually type in the numbers, but it was much more fun, and next year I’ll be ahead of the game…

Anyhoo, one of the things I had to do was split up a full name into a set of strings to get the first name and surname. I used the Split method to do this, which is supplied free with the String class.

Split is wonderful. It does exactly what you want, plus a little bit more, and so I thought it would be worth a blog post. Split looks a bit scary, because it returns an array of strings, which is something you might not be used to. The other thing that you might not like about Split is that you have to give it an array of separators, which looks like a bit of a pain but actually gives you a lot of flexibility. My first string was a bunch of names separated by tab characters. Easy.

string sampleName = "Rob\tMiles";
char[] tabSep = new char[] { '\t' };
string[] allNames = sampleName.Split(tabSep);

The code above would make an array called allNames which holds two string elements, one with "Rob" in it and one with "Miles" in it. To handle the fact that some people have lots of first names I can get the surname (which is the last string in the sample name) by using the Length property of the allNames array:

string surname = allNames[allNames.Length-1];

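Remember that this code will not end well if the starting string has no names in it. A plain Split on an empty string gives you back an array holding a single empty string, so the "surname" comes out empty. If you use the RemoveEmptyEntries option described further down, the allNames array will not contain any elements at all and your code will say a big hello to the IndexOutOfRangeException shortly after running the above. My perfect solution checks the Length of the allNames array and displays an error if this is zero.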
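Here is a minimal sketch of that kind of check. The names NameTools and GetSurname are made up for this example (they are not from my marks program), and it hands back an empty string rather than displaying an error, so the caller can decide what to do about it.

```csharp
using System;

public static class NameTools
{
    // Hypothetical helper: returns the last tab-separated part of a name,
    // or an empty string if there is nothing usable in the input.
    public static string GetSurname(string fullName)
    {
        if (string.IsNullOrEmpty(fullName))
        {
            return string.Empty;
        }

        char[] tabSep = new char[] { '\t' };

        // With RemoveEmptyEntries a string holding only tabs gives a
        // zero-length array, so the Length check below is needed.
        string[] allNames = fullName.Split(tabSep, StringSplitOptions.RemoveEmptyEntries);

        if (allNames.Length == 0)
        {
            return string.Empty;
        }

        // The surname is taken to be the last element.
        return allNames[allNames.Length - 1];
    }
}
```

Calling NameTools.GetSurname("Rob\tMiles") gives back "Miles", and an empty string or one made only of tabs gives back an empty string instead of an exception.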

The second problem I had was to do the same split action on a list of names which were separated by either comma or space (or both). First thing I had to do was create an array of delimiters:

char[] wordSep = new char[] { ',', ' ' };

Now words are split on these two separators. Bad news is that if I give the Split method a string that contains a bunch of both kinds:

string fullName = "rob,    Miles";

then I get a whole array of strings (in this case 6), lots of which are empty. Just finding the surname (which should be the last string) would still be easy enough, but finding all the proper names in the rest of the array would be a pain. Turns out that Split has got this covered though: you can add an option that controls the behaviour of the split so that it ignores the empty items:

string[] allNames = fullName.Split(wordSep, 
                      StringSplitOptions.RemoveEmptyEntries);

The option is a bit of a mouthful, but as you don’t need to say it this is not really a problem. The great thing though is that all those nasty extra empty strings are removed, leaving allNames just holding the two strings I really want. Split is actually very useful and it will work on really big strings. I used it to split a list of several thousand words and it worked a treat.
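To see the difference the option makes, here is a small sketch that does the split both ways and also glues the first names back together. The class and method names (NameSplitDemo, SplitKeepingEmpties, SplitNames, GetFirstNames) are invented for this example, and treating everything except the last item as a first name is my assumption about how the names are laid out.

```csharp
using System;

public static class NameSplitDemo
{
    // Split on either a comma or a space.
    private static readonly char[] WordSep = new char[] { ',', ' ' };

    // Plain Split keeps the empty strings found between adjacent separators.
    public static string[] SplitKeepingEmpties(string fullName)
    {
        return fullName.Split(WordSep);
    }

    // RemoveEmptyEntries throws those empty strings away.
    public static string[] SplitNames(string fullName)
    {
        return fullName.Split(WordSep, StringSplitOptions.RemoveEmptyEntries);
    }

    // Hypothetical helper: everything except the last item, joined with spaces.
    public static string GetFirstNames(string fullName)
    {
        string[] parts = SplitNames(fullName);
        if (parts.Length < 2)
        {
            return string.Empty; // no first names to report
        }
        return string.Join(" ", parts, 0, parts.Length - 1);
    }
}
```

For the sample string above (a comma followed by four spaces), SplitKeepingEmpties gives back 6 strings and SplitNames gives back just 2.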