A Beginner’s Tutorial on String Comparison in C#

This short article covers the right way to compare strings in a C# application. We will look at the various ways to compare strings and see which ones should and should not be used.

Background

Usually, when we want to compare two strings in our applications, we use the equality operator. In most scenarios this works properly, but we should still know what other ways of comparing strings exist, since they may give better performance and more correct results. So let's say I have a variable str and I want to check whether its value is equal to “Yes” or not.
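A minimal sketch of that check, assuming a variable named str that holds some input (the class name, method name and sample values are illustrative only):

```csharp
using System;

public static class EqualityOperatorDemo
{
    // Returns true only when str is exactly "Yes" (same characters, same case).
    public static bool IsYes(string str)
    {
        return str == "Yes";
    }

    public static void Main()
    {
        Console.WriteLine(IsYes("Yes")); // True
        Console.WriteLine(IsYes("yes")); // False: == is case sensitive
    }
}
```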

The equality operator (==) compares in a case-sensitive manner and does not consider the current culture. When a case-insensitive comparison is required, I have seen most developers take one of the two approaches below.

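Either we convert the string to lower case and compare it with a lower-case literal, i.e. str.ToLower() == "yes", or we convert it to upper case and compare it with an upper-case literal, i.e. str.ToUpper() == "YES".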

This works fine in most cases, and because strings are immutable the original string is not modified. It does, however, involve an extra function call and the creation of an extra temporary string (the result of ToLower() or ToUpper()). It can also give wrong results when the code runs under a culture whose casing rules differ from English and the str variable contains non-English characters.
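The culture problem can be sketched as follows. The parameterless ToLower() uses the current culture; this example passes the culture explicitly so the effect does not depend on machine settings. The class and method names are hypothetical, and the sketch assumes Turkish (tr-TR) culture data is installed.

```csharp
using System;
using System.Globalization;

public static class LowerCaseComparisonDemo
{
    // The "lower-case both sides" approach, with the culture made explicit.
    public static bool EqualsViaToLower(string a, string b, CultureInfo culture)
    {
        return a.ToLower(culture) == b.ToLower(culture);
    }

    public static void Main()
    {
        // Assumption: Turkish culture data is available on this machine.
        var turkish = new CultureInfo("tr-TR");

        // Under Turkish casing rules "I" lower-cases to the dotless "ı" (U+0131),
        // so "FILE" becomes "fıle" and no longer matches "file".
        Console.WriteLine(EqualsViaToLower("FILE", "file", turkish));

        // Under the invariant culture both sides lower-case to "file".
        Console.WriteLine(EqualsViaToLower("FILE", "file", CultureInfo.InvariantCulture));
    }
}
```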

So how do we compare strings in a way that circumvents all these problems? The .NET Framework string class already takes care of these scenarios and provides methods that let us perform correct and efficient string comparison in each of them. We will now look at these methods.

Note: We will talk about equality comparison, but all these points are valid for other comparisons too, e.g. finding the order of strings.

Using the Code

The very first thing to understand before jumping into the methods is the type of comparison I might need. First, I might need a culture-sensitive comparison or a culture-insensitive (ordinal) comparison. Second, I might want a case-sensitive or a case-insensitive comparison.

Now let us look at what .NET provides. .NET provides us 3 modes.

  1. InvariantCulture
  2. CurrentCulture
  3. Ordinal

InvariantCulture

The InvariantCulture mode uses a fixed set of comparison rules that is based on English but is not tied to any particular country or region, so it gives the same result whatever culture the application runs under. This mode interprets characters with reference to an alphabet rather than to their numeric codes. It can be visualized as using this sort of reference string to find the order of strings: "aAbBcC...". So in this mode the strings “CAT” and “bat” will be ordered as: “bat”, “CAT”.

CurrentCulture

The 2nd mode, CurrentCulture, also orders strings alphabetically as the invariant culture does, but the order follows the rules of the culture the application is currently running under, so results can differ from one machine or user to another.

Also in this mode, characters are compared according to the linguistic rules of that culture rather than by their numeric codes. For example, under German rules Ä sorts next to A, while under Swedish rules it sorts after Z.

Ordinal

The 3rd mode, Ordinal, simply compares the strings based on the numeric value of each character. In other words, it uses the Unicode value of the characters to find the order. It can be visualized as using the following reference string, which is nothing but all letters ordered by their Unicode/ASCII values: "ABC...abc...". So in this mode the strings “CAT” and “bat” will be ordered as: “CAT”, “bat”.
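The difference in ordering between the modes can be sketched with String.Compare. The class and method names here are illustrative, and the InvariantCulture result assumes the runtime has full globalization data available.

```csharp
using System;

public static class OrderingDemo
{
    // Negative: a sorts before b; zero: equal; positive: a sorts after b.
    public static int Order(string a, string b, StringComparison mode)
    {
        return string.Compare(a, b, mode);
    }

    public static void Main()
    {
        // Ordinal: 'C' (67) is less than 'b' (98), so "CAT" comes first.
        Console.WriteLine(Order("CAT", "bat", StringComparison.Ordinal) < 0); // True

        // InvariantCulture: linguistic order puts b before c regardless of case,
        // so "bat" is expected to come first (assumes full globalization data).
        Console.WriteLine(Order("bat", "CAT", StringComparison.InvariantCulture) < 0);
    }
}
```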

Now, with this information at hand, let us see what .NET provides. The String.Equals and String.Compare methods have overloads that take a StringComparison enum value as an argument. This argument specifies the mode we want to use for the comparison.

This enum has these possible values:

  • CurrentCulture
  • CurrentCultureIgnoreCase
  • InvariantCulture
  • InvariantCultureIgnoreCase
  • Ordinal
  • OrdinalIgnoreCase 

Each enum value makes it fairly clear which mode it is for. Still, let us draw a small matrix for the same.

                                 Case sensitive      Case insensitive
Culture sensitive                CurrentCulture      CurrentCultureIgnoreCase
Culture insensitive (invariant)  InvariantCulture    InvariantCultureIgnoreCase
Ordinal (character values)       Ordinal             OrdinalIgnoreCase

Now let us do the same comparisons we saw above using these modes.

Comparing the strings character by character in a case-sensitive manner.
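A sketch of such a comparison, assuming StringComparison.Ordinal is the intended mode and reusing the str and “Yes” example from earlier (the class and method names are illustrative):

```csharp
using System;

public static class CaseSensitiveEqualsDemo
{
    // Character-by-character, case-sensitive check against "Yes".
    // The static String.Equals overload also tolerates a null str.
    public static bool IsYes(string str)
    {
        return string.Equals(str, "Yes", StringComparison.Ordinal);
    }

    public static void Main()
    {
        Console.WriteLine(IsYes("Yes")); // True
        Console.WriteLine(IsYes("YES")); // False
    }
}
```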

Comparing the strings in a case-insensitive manner.
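A sketch of the case-insensitive version, assuming StringComparison.OrdinalIgnoreCase is the intended mode (the class and method names are illustrative). Unlike the ToLower()/ToUpper() approaches, no temporary string is created.

```csharp
using System;

public static class CaseInsensitiveEqualsDemo
{
    // Case-insensitive check against "Yes" without creating a lower-cased copy.
    public static bool IsYes(string str)
    {
        return string.Equals(str, "Yes", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(IsYes("YES")); // True
        Console.WriteLine(IsYes("yes")); // True
        Console.WriteLine(IsYes("No"));  // False
    }
}
```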

These code snippets also give us the desired results, and probably a little more efficiently than the earlier approaches.

Note: The == operator is equivalent to a comparison with StringComparison.Ordinal. So when this is the mode we need, we can simply use the == operator.

Now let us summarize and see which one should be used when:

  • CurrentCulture - Culture-specific, case-sensitive comparison.
  • CurrentCultureIgnoreCase - Culture-specific, case-insensitive comparison.
  • InvariantCulture - Culture-independent (English-based), case-sensitive comparison.
  • InvariantCultureIgnoreCase - Culture-independent (English-based), case-insensitive comparison.
  • Ordinal - ASCII/Unicode value based, case-sensitive comparison.
  • OrdinalIgnoreCase - ASCII/Unicode value based, case-insensitive comparison.

 

A Note on StringComparer and StringComparison

A common point of confusion is that the StringComparer class can be used for all of these string comparisons as well. This class also offers the same 6 ways of comparing strings. The important thing to note is that this class also implements the comparison interfaces, i.e. IComparer, IEqualityComparer, IComparer<String> and IEqualityComparer<String>.

The StringComparison that we have discussed so far in this tip is an enum that we should use when comparing 2 strings. So when should we set this approach aside and go for the StringComparer class?

The rule of thumb is that if only a string comparison is needed, we should use the String class’s methods such as String.Equals, which use the StringComparison enum to determine the mode for the actual comparison. We should use the StringComparer class only when a method takes an IComparer, IEqualityComparer or IComparer<String> as a parameter and we need it to work on our strings.
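Two typical cases where a comparer object is required can be sketched as follows: a dictionary with case-insensitive keys and a list sorted in ordinal order. The class name, method names and sample data are illustrative only.

```csharp
using System;
using System.Collections.Generic;

public static class StringComparerDemo
{
    // Builds a lookup whose keys match regardless of case.
    public static Dictionary<string, int> BuildCaseInsensitiveLookup()
    {
        // Dictionary's constructor accepts an IEqualityComparer<string>.
        var lookup = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        lookup["Yes"] = 1;
        lookup["No"] = 0;
        return lookup;
    }

    // Returns a sorted copy of the input, ordered by raw character values.
    public static List<string> SortOrdinal(IEnumerable<string> words)
    {
        var sorted = new List<string>(words);
        sorted.Sort(StringComparer.Ordinal); // List<T>.Sort accepts an IComparer<T>
        return sorted;
    }

    public static void Main()
    {
        var lookup = BuildCaseInsensitiveLookup();
        Console.WriteLine(lookup["YES"]); // 1

        var sorted = SortOrdinal(new[] { "bat", "CAT" });
        Console.WriteLine(string.Join(", ", sorted)); // CAT, bat
    }
}
```

With the StringComparison enum alone there is no way to hand the comparison mode to Dictionary or Sort, which is why the comparer class exists alongside it.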

Internally, the String class’s methods may well share their comparison logic with the StringComparer class, but from a developer’s perspective following the above guideline should suffice.

Point of Interest

This short article is written for developers who are at the start of their careers and are manipulating strings in various ways just to get the desired comparison results. We have discussed only the equality operation, but ordering comparisons follow the same rules.
