one.world – A Response to Our Issues With C#


Scott Wiltamuth and Brad Abrams from Microsoft kindly provided us with the following comments on our list of issues with C#. Scott's email follows.


Here are answers to the questions you asked. These are a mix of responses from Brad and me. Brad covered the issues with the .NET class library, and I covered the C#-specific issues. If you have follow-up questions, feel free to email us directly.

Q. Arrays are also collections (ICollection) --- however, why aren't they also lists (IList)?
Arrays are also lists in the current build -- System.Array now implements IList.
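
As a small illustration of what this allows, the sketch below treats an ordinary array as an IList. The class name ArrayAsList and the sample values are made up for the example; note that arrays are fixed-size lists, so IList members such as Add and Remove are expected to throw NotSupportedException.

```csharp
using System;
using System.Collections;

class ArrayAsList
{
    static void Main()
    {
        int[] numbers = { 3, 1, 4 };

        // An array converts implicitly to IList.
        IList list = numbers;

        Console.WriteLine(list.Contains(4));  // membership test through IList
        Console.WriteLine(list.IndexOf(1));   // position lookup through IList
        Console.WriteLine(list.IsFixedSize);  // arrays cannot grow or shrink

        // Writing through the IList indexer changes the underlying array.
        list[0] = 9;
        Console.WriteLine(numbers[0]);
    }
}
```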

Q. Out and ref parameters to methods, that is multiple results and pass-by-reference --- however, it's unclear whether they are thread-safe
A call to a member that has an out or ref parameter is not automatically thread-safe. If you need thread safety, you'll need to obtain a lock manually. Whether this means using the lock statement around the call or locking at some higher level is a decision that is best left to the developer. The compiler can't do a good job of this automatically.
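
A minimal sketch of the first option, taking the lock around the call itself. The names Counter, Gate, Add, and Run are hypothetical; the only point is that the caller, not the compiler, decides where the lock goes.

```csharp
using System;
using System.Threading;

class Counter
{
    // Hypothetical lock object shared by every thread that touches total.
    static readonly object Gate = new object();
    static int total;

    // Hypothetical helper that updates a variable passed by reference.
    public static void Add(ref int target, int amount)
    {
        target = target + amount;
    }

    static void Worker()
    {
        for (int i = 0; i < 100000; i++)
        {
            // The read-modify-write inside Add is not atomic, so the
            // caller serializes access to total manually.
            lock (Gate)
            {
                Add(ref total, 1);
            }
        }
    }

    // Runs two workers to completion and returns the final count.
    public static int Run()
    {
        total = 0;
        Thread t1 = new Thread(new ThreadStart(Worker));
        Thread t2 = new Thread(new ThreadStart(Worker));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Without the lock statement, the two workers could interleave inside Add and lose updates; with it, every increment is applied.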

Exceptions
     Q. Why aren't thrown exceptions declared?
     Q. Where is the documentation?

This is a good question, and one that we have carefully considered.

Some languages allow or require member signatures to include exception specifications. E.g., in C++, a method that raises a MyException can be declared with:

    void F(int a) throw (MyException) {...}

and in Java such a method can be defined as

    void F(int a) throws MyException {...}

C# neither requires nor allows such exception specifications. Examination of small programs leads to the conclusion that requiring exception specifications could both enhance developer productivity and enhance code quality, but experience with large software projects suggests a different result -- decreased productivity and little or no increase in code quality.

Requiring exception specifications would decrease developer productivity because of the sheer proliferation of exception specifications. This proliferation proceeds in two dimensions:

  • The number of members. Modern exception handling allows a division of work between the code that raises the exception and the code that handles it. These pieces of code may be separated by intervening code. E.g., A calls B, B calls C, C calls D, and D raises an exception that is eventually handled by A. If C# required exception specifications, then each of A, B, C, and D would have to contain exception-handling related code even though only A and D do any actual work related to the exception.
  • The number of possible exceptions. The number of exceptions is unquestionably large. E.g., any code that adds two numbers could result in an overflow exception, any code that divides two numbers could result in a divide by zero exception, and any code that instantiates an object could result in an out of memory exception.

The lack of an increase in code quality is related to the response of developers to the proliferation of exception specifications. Developers who carefully examine all of the exception specification errors reported by the compiler might see an increase in code quality, but this would come at the expense of productivity. On the other hand, some developers will respond to the errors by mindlessly adding whatever exception specifications the compiler requires, and others will choose to subvert the intent of the language designers by adding a generic exception specification to every member. The former option is unlikely to increase code quality. The latter option effectively makes exception specifications optional -- every member is required to have a "throws Exception" clause but this doesn't actually communicate any information or require any thought.

A better strategy is for client code -- code that is using a class library -- to include both generic exception handling and specific exception handling.

Generic exception handling is performed centrally, and generically deals with all exceptions. Specific exception handling checks for a smaller number of exceptions -- the ones that the client code is specifically prepared to respond to or recover from. This split between generic and specific exception handlers is practically required since some exceptions (e.g., out of memory exceptions) can occur in many program locations but occur rarely.

All client code needs to have some generic exception handling.
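
The split can be sketched as follows, reusing the A, B, C, D call chain from above. The class name Client and the recovery policy (returning 0) are assumptions made only for this illustration. Note that B and C contain no exception-related code at all.

```csharp
using System;

class Client
{
    // D raises the exception: integer division by zero throws
    // DivideByZeroException.
    public static int D(int x, int y) { return x / y; }

    // C and B simply pass the exception through untouched.
    public static int C(int x, int y) { return D(x, y); }
    public static int B(int x, int y) { return C(x, y); }

    // Specific handling: A is prepared to recover from this one exception.
    public static int A(int x, int y)
    {
        try
        {
            return B(x, y);
        }
        catch (DivideByZeroException)
        {
            return 0; // hypothetical recovery policy
        }
    }

    static int Main()
    {
        // Generic handling: one central handler for everything else.
        try
        {
            Console.WriteLine(A(10, 2));
            Console.WriteLine(A(10, 0));
            return 0;
        }
        catch (Exception e)
        {
            Console.Error.WriteLine("Unhandled: " + e.Message);
            return 1;
        }
    }
}
```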

It is interesting to note that neither Java nor C++ is strict about requiring exception specifications:

  • Java provides support for both checked and unchecked exceptions, and (in a "do as I say and not as I do" wrinkle) many of the Java-defined exceptions are unchecked exceptions. A class author is supposed to judge whether the exception can occur "at many points in the program" and whether "recovery would be difficult or impossible" in order to make a decision about whether to employ a checked or unchecked exception. We think that this is an impossible decision to make since these factors are highly dependent on the code that is using the class rather than the class itself.
  • The C++ language standard requires strict exception checking, but this standard is commonly ignored by tools, including Visual C++. Given the proliferation of exception specifications (see above) and the lack of compiler enforcement, it is hard to imagine that a complex class library could actually get the exception specifications right. If the exceptions are "just for documentation purposes" then we think they belong in the documentation rather than the code.

Which leads me to the documentation question. The class library documentation is automatically generated from C#'s XML-based in-source documentation. For instance, the code below can be compiled with the /doc option to produce an accompanying XML file. Here's the code:

  using System;
  class Test
  {

   /// <summary>
   /// The F method does not do anything interesting.
   /// </summary>
   /// <param name="x">the value to F</param>
   /// <exception type="cref=ArgumentException">If the argument is negative.</exception>
   static void F(int x) {
    if (x < 0) 
     throw new ArgumentException();
    Console.WriteLine("F({0})", x);
   }

   static void Main(string[] args) {
    try {
     F(0);
     F(-1);
    }
    catch(ArgumentException) {
     Console.WriteLine("ArgumentException");
    }
   }
  }

Here is the accompanying XML file that is generated:

  <?xml version="1.0"?>
  <doc>
      <assembly>
          <name>scratch</name>
      </assembly>
      <members>
          <member name="M:Test.F(System.Int32)">
              <summary>
              The F method does not do anything interesting.
              </summary>
              <param name="x">the value to F</param>
              <exception type="cref=ArgumentException">If the argument
	      is negative.</exception>
          </member>
      </members>
  </doc>

We then use the generated file to automatically produce documentation.

Q. const and readonly for fields mean pretty much the same
Const and readonly are similar, but they are not exactly the same. A const field is a compile-time constant, meaning that its value can be computed at compile-time. A readonly field enables additional scenarios in which some code must be run during construction of the type. After construction, a readonly field cannot be changed.

For instance, const members can be used to define members like:

    struct Test
    {
        public const double Pi = 3.14;
        public const int Zero = 0;
    }

since values like 3.14 and 0 are compile-time constants. However, consider the case where you define a type and want to provide some pre-fab instances of it. E.g., you might want to define a Color class and provide "constants" for common colors like Black, White, etc. It isn't possible to do this with const members, as the right hand sides are not compile-time constants. One could do this with regular static members:

    public class Color
    {
        public static Color Black = new Color(0, 0, 0);
        public static Color White = new Color(255, 255, 255);
        public static Color Red = new Color(255, 0, 0);
        public static Color Green = new Color(0, 255, 0);
        public static Color Blue = new Color(0, 0, 255);
        private byte red, green, blue;

        public Color(byte r, byte g, byte b) {
            red = r;
            green = g;
            blue = b;
        }
    }

but then there is nothing to keep a client of Color from mucking with it, perhaps by swapping the Black and White values. Needless to say, this would cause consternation for other clients of the Color class. The "readonly" feature addresses this scenario. By simply introducing the readonly keyword in the declarations, we preserve the flexible initialization while preventing client code from mucking around.

    public class Color
    {
        public static readonly Color Black = new Color(0, 0, 0);
        public static readonly Color White = new Color(255, 255, 255);
        public static readonly Color Red = new Color(255, 0, 0);
        public static readonly Color Green = new Color(0, 255, 0);
        public static readonly Color Blue = new Color(0, 0, 255);
        private byte red, green, blue;

        public Color(byte r, byte g, byte b) {
            red = r;
            green = g;
            blue = b;
        }
    }

It is interesting to note that const members are always static, whereas a readonly member can be either static or not, just like a regular field.
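
A short sketch of that difference, using a hypothetical Circle class: the const member is reached through the type, while the non-static readonly field belongs to each instance and is assigned in the constructor.

```csharp
using System;

class Circle
{
    public const double Pi = 3.14;   // implicitly static
    public readonly double Radius;   // one value per instance

    public Circle(double radius)
    {
        Radius = radius;             // allowed: still inside the constructor
    }

    public double Area()
    {
        return Pi * Radius * Radius;
    }
}

class Demo
{
    static void Main()
    {
        Circle c = new Circle(2.0);
        Console.WriteLine(Circle.Pi);   // accessed through the type
        Console.WriteLine(c.Radius);    // accessed through the instance
        Console.WriteLine(c.Area());
        // c.Radius = 3.0;              // would not compile: field is readonly
    }
}
```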

It is possible to use a single keyword for these two purposes, but this leads to either versioning problems or performance problems. Assume for a moment that we used a single keyword for this (const) and a developer wrote:

    public class A
    {
        public static const int C = 0;
    }

and a different developer wrote code that relied on A:

    public class B
    {
        static void Main() {
            Console.WriteLine(A.C);
        }
    }

Now, can the code that is generated rely on the fact that A.C is a compile-time constant? I.e., can the use of A.C simply be replaced by the value 0? If you say "yes" to this, then that means that the developer of A cannot change the way that A.C is initialized -- this ties the hands of the developer of A without permission. If you say "no" to this question then an important optimization is missed. Perhaps the author of A is positive that A.C will always be zero. The use of both const and readonly allows the developer of A to specify the intent. This makes for better versioning behavior and also better performance.
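
With the two keywords as they exist in C#, the same example might look like the sketch below. The member D is an addition for illustration only. The difference shows up only when A and B live in separate assemblies, which a single listing cannot demonstrate: a use of A.C is replaced by its value when B is compiled, while a use of A.D is read from A at run time.

```csharp
using System;

public class A
{
    // Compile-time constant: the value is copied into code that uses A.C.
    public const int C = 0;

    // Run-time constant: the value is fetched from A whenever A.D is read.
    public static readonly int D = 0;
}

public class B
{
    static void Main()
    {
        Console.WriteLine(A.C);
        Console.WriteLine(A.D);
    }
}
```

If the author of A later ships a new version with both values changed and B is not recompiled, B keeps printing the old value for A.C but picks up the new value for A.D.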

Q. No inner classes
There is potential for terminology confusion, so let's clear this up up-front. You used the Java terms, and they're as good as any so I'll use them too. Java supports both "nested" and "inner" classes.

  • A nested class is lexically nested in another but otherwise has no special relationship. In Java, this is declared by using the "static" keyword on a class declaration that is lexically nested in a class declaration.
  • An "inner" class has special "this" pointer behavior. An inner class can access the instance members of the outer class using "this". An "inner" class is declared by omitting the "static" keyword for a class declaration that is lexically nested in a class declaration.

C# supports nested types. A nested type is declared by writing a type declaration that is nested in a type declaration. E.g.,

    public class A
    {
        public void F() {...}

        public class B
        {
            public void G() {...}
        }
    }
  
    public class Test
    {
        static void Main() {
            A a = new A();
            A.B b = new A.B();
            a.F();
            b.G();
        }
    }

C# does not provide any syntactic support for inner classes. We think the language is simpler and better without it. If the functionality of inner classes is required, it can easily be created through judicious use of constructors. The exact design pattern to use varies a bit based on the scenario, but a common pattern is for the nested type to be publicly visible but not have any constructors that are publicly visible. Instances are handed out by the outer class, which specifies the "outer" during construction:

    public class A
    {
        public B CreateB() {
            return new B(this);
        }
        public void F() {...}

        public class B
        {
            A outer;
    
            internal B(A a) {
                outer = a;
            }

            public void G() {
                outer.F();
            }
        }
    }

We've written a lot of class library code, and the lack of inner classes hasn't affected us at all. If you think the lack of inner classes is an issue, I would be interested to hear how you use them.

Q. How about a common supertype or interface for all numbers?
All the numbers are value types, so they must inherit directly from System.ValueType. In our type system, it is not possible for a value type to inherit from anything other than System.ValueType. We do provide the IConvertible interface for all base types.
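
As one hedged illustration of what IConvertible makes possible, the sketch below sums values of different numeric types by converting each one to double. The class name Numbers and the method Sum are hypothetical; this is not a substitute for a true numeric supertype, since conversions to double can lose precision for large values.

```csharp
using System;

class Numbers
{
    // Sums values of mixed numeric types by going through IConvertible.
    public static double Sum(IConvertible[] values)
    {
        double total = 0.0;
        foreach (IConvertible v in values)
        {
            // Passing null asks for the default (current culture) provider.
            total += v.ToDouble(null);
        }
        return total;
    }

    static void Main()
    {
        // Each element is boxed and viewed through the interface.
        IConvertible[] values = { 1, 2.5, (byte)3, 4L };
        Console.WriteLine(Sum(values));
    }
}
```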

If you had a common supertype for numbers, what would you do with it -- what behavior would be on this common supertype, and how would you use it?

Q. Remove methods for collections should return the removed element
We've gotten this request from some other customers as well. We are considering it.

Q. Where is the documentation for String.Format?
There is no documentation for this functionality in the release that you have. We will have documentation for this in the upcoming beta release. If you'd like more info on this now, we can get it for you.

Q. Where is BigInteger?
We will not provide a BigInteger type in this release. If we did provide one, it wouldn't be a special type in any way -- it would just be a struct with overloaded operators. We expect third parties to fill gaps like this, and this is an important part of the philosophy of C# and .NET. We can't do everything, but we want to provide enough building blocks so that people can either roll their own or buy components from 3rd party vendors.
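
To make "just a struct with overloaded operators" concrete, here is a deliberately tiny sketch. UInt128Sketch is a hypothetical fixed-width type invented for this example, not an arbitrary-precision BigInteger and not a library type; it only shows the building blocks (a user-defined value type, an overloaded + operator, and an implicit conversion) that a third-party implementation could use.

```csharp
using System;

// Hypothetical 128-bit unsigned integer built from two 64-bit halves.
public struct UInt128Sketch
{
    public readonly ulong High;
    public readonly ulong Low;

    public UInt128Sketch(ulong high, ulong low)
    {
        High = high;
        Low = low;
    }

    // Overloaded addition: add the low halves, then carry into the high half.
    public static UInt128Sketch operator +(UInt128Sketch a, UInt128Sketch b)
    {
        ulong low = unchecked(a.Low + b.Low);
        ulong carry = low < a.Low ? 1UL : 0UL;   // wrapped around => carry
        ulong high = unchecked(a.High + b.High + carry);
        return new UInt128Sketch(high, low);
    }

    // Lets an ordinary ulong be used wherever the struct is expected.
    public static implicit operator UInt128Sketch(ulong value)
    {
        return new UInt128Sketch(0UL, value);
    }
}

class Demo
{
    static void Main()
    {
        UInt128Sketch x = ulong.MaxValue;
        UInt128Sketch one = 1UL;
        UInt128Sketch sum = x + one;     // overflows the low half
        Console.WriteLine(sum.High);     // the carry lands here
        Console.WriteLine(sum.Low);
    }
}
```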

This is a significant philosophical difference from Java, which doesn't allow the definition of value types or overloaded operators even though these features are used by intrinsic types like int.

We're always looking for ways to improve the runtime, though, and might provide a BigInteger type of our own in the future.

--Scott