Tuesday, May 24, 2016

The magic of hiding your NuGet dependencies

Welcome to dependency hell

While working on a little open-source demo project, I ran into that well-known challenge of NuGet dependency management again. This little project produces a NuGet package that itself relies on other packages. Now, if I just added those dependencies to the .nuspec file using the <dependencies> element, I would put a burden on the people who want to use my package. Why? Because whenever they use my package, they start to depend on all the packages my package references.
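
For those who haven't seen one, such a declaration in the .nuspec looks like this (a minimal sketch; the id and version are just the example from this post):

<package>
  <metadata>
    <!-- id, version, authors, description, ... -->
    <dependencies>
      <dependency id="Newtonsoft.Json" version="7.1.3" />
    </dependencies>
  </metadata>
</package>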

If my package is the only one they use, that's probably fine. But what if they use another package that also relies on Json.NET (for instance), but on an incompatible version? You can't use two different versions of the same assembly in the same process (or more specifically, the same AppDomain). If I'm using 7.1.3 and they are using 7.2.1, and the involved package uses Semantic Versioning (which Json.NET fortunately does), the NuGet Package Manager will happily select the higher of the two; 7.2 implies a backwards-compatible feature update of 7.1. But if they are using 8.1, which implies a major upgrade with potential breaking changes, the NuGet Package Manager will simply give up. Now imagine that my package has a lot more dependencies that can conflict with the dependencies of the code base that is using it. That's what the .NET community typically refers to as "dependency hell".

A way out?

Yes! Merge as many of your dependencies' assemblies into the main assembly as internal types that are not visible to the outside world. This has some implications, however. For instance, if I merged Json.NET into my main assembly and the consuming party also used Json.NET, the Json.NET classes would appear twice in the AppDomain at run-time. Even though both classes would have the same name and namespace, the CLR would treat them as completely different types. To be more specific, if I annotated my code with the [JsonConverter] attribute and then merged the Json.NET assembly into my own, the other Json.NET as loaded by the consuming party wouldn't be able to recognize that attribute.

What does that mean? Well, it means that you need to consider the circumstances before you decide whether or not to merge a dependency. Let me help you with that by providing a couple of guidelines:

  1. If the types of that dependency appear in the public API of your package, you must expose it as a NuGet package dependency.
  2. If the package and the package consumer have to use the same version of the dependency at run-time, use a package dependency. The Json.NET-annotated types mentioned above are a good example of this.
  3. If the package and package consumer don't have to use the same version of the dependency, then by all means, merge the dependency into the package. You'll make your package consumer a happy person.

Hiding your dependencies

Obviously option 3 is the preferred one, but it isn't always possible. Sometimes you can still get there by not directly exposing types from your internal dependencies and using smart constructs like delegates instead. For example, let's say your internal dependency has some kind of extension point that consumers of your package need. Something like this:

public interface IExtensionPoint
{
    void Connect(ModuleInfo module);
}

Your first reaction might be to take option 1 and expose IExtensionPoint to your consumers. But instead, you could define a custom ModuleInfoAdapter class that mimics some of the properties of the ModuleInfo class and expose a delegate like this:

public delegate void Connect(ModuleInfoAdapter module);

Then, when the consumer passes a method or expression into that delegate, you can internally map the exposed ModuleInfoAdapter back to the actual type expected by the merged library.
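
To make that concrete, here's a minimal sketch. The properties on ModuleInfoAdapter and the mapping code are hypothetical; mirror whatever the real ModuleInfo exposes:

public class ModuleInfoAdapter
{
    // Hypothetical properties mimicking the parts of ModuleInfo
    // that your consumers actually need.
    public string Name { get; set; }
    public string Version { get; set; }
}

internal class ExtensionPointBridge : IExtensionPoint
{
    private readonly Connect connectCallback;

    public ExtensionPointBridge(Connect connectCallback)
    {
        this.connectCallback = connectCallback;
    }

    // Called by the merged library; maps the internal ModuleInfo onto the
    // public adapter before invoking the consumer-provided delegate.
    public void Connect(ModuleInfo module)
    {
        connectCallback(new ModuleInfoAdapter
        {
            Name = module.Name,
            Version = module.Version
        });
    }
}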

Another common example is the case where your package internally uses a library that supports selecting and configuring a specific library-provided algorithm (or Strategy), and you need to delegate both decisions to your consumers. You can hide that detail by defining an abstraction on top of that algorithm and allowing your consumer to use some kind of Factory Method to select a particular implementation of that strategy without exposing its internals. You'd be surprised what you can do with some smart applications of the Adapter and Bridge patterns, or by simply implementing certain interfaces explicitly. This may all feel like I'm overcomplicating things, but almost anything is warranted to keep your dependencies hidden.
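
As an illustration, consider a sketch along these lines (all type names are hypothetical; the internal types stand in for whatever lives inside the merged library):

// Stand-ins for types that would live inside the merged library.
internal interface IInternalStrategy
{
    byte[] Compress(byte[] data);
}

internal class FastStrategy : IInternalStrategy
{
    public byte[] Compress(byte[] data) => data; // details omitted
}

// The public abstraction on top of the library-provided algorithm.
public interface ICompressionAlgorithm
{
    byte[] Compress(byte[] data);
}

public static class CompressionAlgorithms
{
    // Factory Method that selects an internal strategy without ever
    // exposing the merged library's types to the consumer.
    public static ICompressionAlgorithm Fastest() =>
        new Adapter(new FastStrategy());

    // Adapter pattern: hides the internal strategy behind the abstraction.
    private class Adapter : ICompressionAlgorithm
    {
        private readonly IInternalStrategy strategy;

        public Adapter(IInternalStrategy strategy)
        {
            this.strategy = strategy;
        }

        public byte[] Compress(byte[] data) => strategy.Compress(data);
    }
}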

To merge or to repack

Within the .NET world there are currently two tools for merging multiple assemblies (and their PDBs) into a single output assembly: ILMerge and ILRepack. The former has been Microsoft's official tool of choice, but hasn't received a lot of love in recent years. The latter is an open-source library that has seen many releases since its inception. Which one to use? It depends. In the beginning, ILRepack lacked support for certain edge cases, which forced us to switch back to ILMerge. ILMerge, on the other hand, didn't properly support .NET's portable class libraries out of the box. I recommend trying ILRepack first and seeing how far you get with it. You can find an example of a PSake script that uses ILRepack in my FluidCaching project. Do note that very recent versions require .NET 4.6, so using the latest and greatest may complicate your build agent requirements. Regardless, we've run into some weird exceptional situations that neither tool could handle, which forced us to expose an internal dependency as a public one anyway.

So what about you? Does this all make sense? What kind of challenges did you run into while dealing with dependencies and how did you solve those? I'd love to hear your thoughts by commenting below. Oh, and follow me at @ddoomen to get regular updates on my everlasting quest for better solutions.

Wednesday, May 04, 2016

The definitive guide to extending Fluent Assertions

Some background

In my recent post about the responsibilities of an open-source developer I said that the author of an open-source project is fully entitled to reject a contribution. In the case of Fluent Assertions, this is no different. Some things just aren't a good fit for a general-purpose assertion library, especially if a feature is tied to a specific technology. Some things are a good fit though. The upcoming support for JSON is one example of that. It started as a separate NuGet package, but is about to be merged into the core package.

To accommodate those developers whose ideas don't end up in the library, FA offers several extension points, so that they can build their own extensions with the same consistent API and behavior people are used to. And if they feel the need to alter the behavior of the built-in set of assertion methods, they can use the many hooks offered out of the box. The flip side of all of this is that we cannot just change the internals of FA without considering backwards compatibility. But looking at the many extensions available on NuGet, it's absolutely worth it.

Building your own extensions

As an example, let's create an extension method on DirectoryInfo like this:

public static class DirectoryInfoExtensions
{
    public static DirectoryInfoAssertions Should(this DirectoryInfo instance)
    {
        return new DirectoryInfoAssertions(instance);
    }
}

It's the returned assertions class that provides the actual assertion methods. You don't need to, but if you sub-class the self-referencing generic class ReferenceTypeAssertions<TSubject, TSelf>, you'll get methods like BeNull, BeSameAs and Match for free. Assuming you did, and you provided an override of the Context property so that these methods know we're dealing with a directory, it's time for the next step. Let's add an extension that allows you to assert that the involved directory contains a particular file.

public class DirectoryInfoAssertions :
    ReferenceTypeAssertions<DirectoryInfo, DirectoryInfoAssertions>
{
    public DirectoryInfoAssertions(DirectoryInfo instance)
    {
        Subject = instance;
    }

    protected override string Context => "directory";

    public AndConstraint<DirectoryInfoAssertions> ContainFile(
        string filename, string because = "", params object[] becauseArgs)
    {
        Execute.Assertion
            .BecauseOf(because, becauseArgs)
            .ForCondition(!string.IsNullOrEmpty(filename))
            .FailWith("You can't assert a file exist if you don't pass a proper name")
            .Then
            .Given(() => Subject.GetFiles())
            .ForCondition(files => files.Any(fileInfo => fileInfo.Name.Equals(filename)))
            .FailWith("Expected {context:directory} to contain {0}{reason}, but found {1}.",
                _ => filename, files => files.Select(file => file.Name));

        return new AndConstraint<DirectoryInfoAssertions>(this);
    }
}

This is quite an elaborate example which shows some of the more advanced extensibility features. Let me highlight some things:

  • The Subject property is used to give the base-class extensions access to the current DirectoryInfo object.
  • Execute.Assertion is the point of entrance into the internal fluent assertion API.
  • The optional because parameter can contain string.Format-style placeholders which will be filled using the values provided through becauseArgs. The caller can use them to explain why the assertion should succeed. By passing them into the BecauseOf method, you can refer to the expanded result using the {reason} tag in the FailWith method.
  • The Then property is just there to chain multiple assertions together. You can have more than one.
  • The Given method allows you to perform a lazily evaluated projection on whatever you want. In this case I use it to get a list of FileInfo objects from the current directory. Notice that the resulting expression is not evaluated until the final call to FailWith.
  • FailWith will evaluate the condition and raise the appropriate exception for the detected test framework. Its message can again contain numbered placeholders as well as the special named placeholders {context} and {reason}. I'll explain the former in a minute, but suffice it to say that it displays the text "directory" at that point. The remaining placeholders will be filled by applying the appropriate type-specific value formatter for the provided arguments. If those arguments involve a non-primitive type such as a collection or complex type, the formatters will use recursion to always apply the appropriate formatter.
  • Since we used the Given construct to create a projection, the parameters of FailWith are formed by a params array of Func<T, object> that give you access to the projection (such as the FileInfo[] in this particular case). But normally, it's just a params array of objects.
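
With all of that in place, a consumer can use the new assertion like any built-in one. A quick usage sketch (the path and reason are obviously made up):

new DirectoryInfo(@"c:\projects\demo").Should().ContainFile(
    "readme.txt", "because every decent project {0} has one", "demo");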

Scoping your extensions

Now what if you want to reuse your newly created extension method within some other extension method? For instance, what if you want to apply that assertion to a collection of directories? Wouldn't it be cool if you could tell your extension method about the current directory? This is where the AssertionScope comes into play.

public AndConstraint<DirectoryInfoAssertions> ContainFileInAllSubdirectories(
    string filename, string because = "", params object[] becauseArgs)
{
    foreach (DirectoryInfo subDirectory in Subject.GetDirectories())
    {
        using (new AssertionScope(subDirectory.FullName))
        {
            subDirectory.Should().ContainFile(filename, because, becauseArgs);
        }
    }

    return new AndConstraint<DirectoryInfoAssertions>(this);
}

Whatever you pass into its constructor will be used to overwrite the default {context} passed to FailWith.

.FailWith("Expected {context:directory} to contain {0}{reason}, but found {1}.",

So in this case, our nicely crafted ContainFile extension method will display the directory it used to assert that the file existed. You can do a lot more advanced stuff if you want; just check out the code used by the structural equivalency API.
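
Calling the scoped variant is no different from any other assertion (again, a made-up example):

new DirectoryInfo(@"c:\projects").Should().ContainFileInAllSubdirectories(
    "license.txt", "because all projects must carry a license");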

Rendering objects with beauty

Whenever Fluent Assertions raises an assertion exception, it uses value formatters to render the display representation of an object. Notice that these things are supposed to do more than just call ToString. A good formatter will include the relevant parts and hide the irrelevant ones. For instance, the DateTimeOffsetValueFormatter is there to give you a nice human-readable representation of a date and time with offset, and it will only show the parts of that value that have non-default values. Check out the specs to see some examples of that.

You can hook up your own formatters in several ways, but what does it mean to build your own? Well, a value formatter just needs to implement the two methods IValueFormatter declares. First, it needs to tell FA whether your formatter can handle a certain type by implementing the well-named method CanHandle(object). The other one is there to, no surprises here, render it to a string.

string ToString(object value, bool useLineBreaks, IList<object> processedObjects = null,
    int nestedPropertyLevel = 0);

Next to the actual value that needs rendering, this method accepts a couple of parameters worth mentioning.

  • useLineBreaks denotes that the value should be prefixed by a newline. It is used by some assertion code to force displaying the various elements of the failure message on a separate line.
  • processedObjects needs to be passed to any (recursive) call to Formatter.ToString so that it can detect cyclic dependencies and bail out in time.
  • nestedPropertyLevel is used when rendering a complex object that would involve multiple, potentially recursive, nested calls to Formatter.ToString. It allows the formatter to display its representation using an indented view.

This is what an implementation for DirectoryInfo would look like:

public class DirectoryInfoValueFormatter : IValueFormatter
{
    public bool CanHandle(object value)
    {
        return value is DirectoryInfo;
    }

    public string ToString(object value, bool useLineBreaks, IList<object> processedObjects = null, int nestedPropertyLevel = 0)
    {
        string newline = useLineBreaks ? Environment.NewLine : "";
        string padding = new string('\t', nestedPropertyLevel);

        var info = (DirectoryInfo)value;
        return $"{newline}{padding} {info.FullName} ({info.GetFiles().Length} files, {info.GetDirectories().Length} directories)";
    }
}
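
As for hooking it up, one of those "several ways" is to register the formatter in code through the static Formatter class (from memory, so treat the exact call as an assumption):

// Makes FA use our formatter whenever it renders a DirectoryInfo
// in a failure message.
Formatter.AddFormatter(new DirectoryInfoValueFormatter());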

Yeah, I do see that this API has become a bit overcomplicated, but changing it would cause breaking changes. I'm trying to avoid those until I find a great incentive for people to upgrade to a new major release.

To be or not to be a value type

The structural equivalency API provided by ShouldBeEquivalentTo and ShouldAllBeEquivalentTo is arguably the most powerful, but also the most complex, part of Fluent Assertions. And to make things worse, you can extend and adapt the default behavior quite extensively. For instance, to determine whether FA needs to recurse into a complex object, it needs to know which objects should be treated as complex. An object that has properties isn't necessarily a complex type that you want to recurse on; DirectoryInfo has properties, but you don't want FA to just traverse them. So you need to tell FA which types should be treated as value types. The default (naive) behavior is to treat everything in the System namespace as a value type:

public static Func<Type, bool> IsValueType = type => (type.Namespace == typeof(int).Namespace);

But you can easily change that by setting the global AssertionOptions.IsValueType function or, for individual assertions, by temporarily using the ComparingByValue<T> option.
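
A quick sketch of both approaches (subject and expected are placeholders for whatever you're comparing):

// Globally: also treat DirectoryInfo as a value type, so FA compares it
// using equality rather than traversing its properties.
AssertionOptions.IsValueType = type =>
    (type.Namespace == typeof(int).Namespace) || (type == typeof(DirectoryInfo));

// Or scoped to a single assertion.
subject.ShouldBeEquivalentTo(expected,
    options => options.ComparingByValue<DirectoryInfo>());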

Equivalency assertion step by step

The entire structural equivalency API is built around the concept of a collection of equivalency steps that run in a predefined order. Each step is an implementation of IEquivalencyStep, which exposes two methods: CanHandle and Handle. You can pass your own implementation to a particular assertion call through the Using method (which puts it behind the final default step), or directly tweak the global AssertionOptions.EquivalencySteps collection. Check out the underlying EquivalencyStepCollection to see how your custom step relates to the other steps. The Handle method has the following signature:

bool Handle(IEquivalencyValidationContext context, IEquivalencyValidator parent,
    IEquivalencyAssertionOptions config);

It provides you with a couple of parameters. The context gives you access to the subject-under-test, the expectation, and information on where you are in a deeply nested structure. The parent allows you to perform nested assertions, like the StructuralEqualityEquivalencyStep does. The config parameter gives you access to the effective configuration that applies to the current assertion call. Using this knowledge, the simplest built-in step looks like this:

public class SimpleEqualityEquivalencyStep : IEquivalencyStep
{
    public bool CanHandle(IEquivalencyValidationContext context,
        IEquivalencyAssertionOptions config)
    {
        return !config.IsRecursive && !context.IsRoot;
    }

    public bool Handle(IEquivalencyValidationContext context, IEquivalencyValidator
        structuralEqualityValidator, IEquivalencyAssertionOptions config)
    {
        context.Subject.Should().Be(context.Expectation, context.Because, context.BecauseArgs);

        return true;
    }
}

Since Should().Be() internally uses the {context} placeholder I discussed at the beginning of this article, and the encompassing EquivalencyValidator uses the AssertionScope to set up the right context, you'll get crystal-clear messages when something doesn't meet the expectation. This particular extension point is pretty flexible, but given the many options ShouldBeEquivalentTo provides out of the box, you probably won't need it.

About selection, matching and ordering

Next to tuning the value-type evaluation and changing the internal execution plan of the equivalency API, there are a couple of more specific extension points. They are used internally by some of the methods provided by the options parameter, but you can add your own by calling the appropriate overloads of the Using method. You can even do this globally through the static AssertionOptions.AssertEquivalencyUsing method.

The interface IMemberSelectionRule defines an abstraction that determines which members (fields and properties) of the subject need to be included in the equivalency assertion. The main in/out parameter is a collection of SelectedMemberInfo objects representing the fields and properties that need to be included. However, if your selection rule needs to start from scratch, you should override IncludesMembers and return false. The rule also gets access to the configuration for the current invocation, as well as some contextual information about the compile-time and run-time types of the current parent member. As an example, the AllPublicPropertiesSelectionRule looks like this:

internal class AllPublicPropertiesSelectionRule : IMemberSelectionRule
{
    public bool IncludesMembers => false;

    public IEnumerable<SelectedMemberInfo> SelectMembers(IEnumerable<SelectedMemberInfo>
        selectedMembers, ISubjectInfo context, IEquivalencyAssertionOptions config)
    {
        return selectedMembers.Union(
                config.GetSubjectType(context).GetNonPrivateProperties()
                .Select(SelectedMemberInfo.Create));
    }

    public override string ToString()
    {
        return "Include all non-private properties";
    }
}

Notice the override of ToString. The output of that is included in the message in case the assertion fails. It'll help the developer understand the 'rules' that were applied to the assertion.
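
Hooking up a rule like that happens through those same Using overloads, per call or globally. A sketch (MyCustomSelectionRule is a placeholder for your own implementation):

// Per assertion call.
subject.ShouldBeEquivalentTo(expected,
    options => options.Using(new MyCustomSelectionRule()));

// Or globally, affecting every equivalency assertion.
AssertionOptions.AssertEquivalencyUsing(
    options => options.Using(new MyCustomSelectionRule()));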

Another interface, IMemberMatchingRule, is used to map a member of the subject to the member of the expectation object with which it should be compared. It's not something you'll likely need to implement, but if you do, check out the built-in implementations MustMatchByNameRule and TryMatchByNameRule. It receives a SelectedMemberInfo for the subject's member, the expectation to which you need to map a member, the dotted path to it, and the configuration object used everywhere.

The final interface, IOrderingRule, is used to determine whether FA should be strict about the order of items in collections. The ByteArrayOrderingRule, the one used by default, ensures that FA isn't strict about the order unless it involves a byte[]. The reason behind that is that when ordering is treated as irrelevant, FA needs to compare every item in the one collection with every item in the other. Each of those comparisons might involve a recursive and nested comparison of the object graph represented by the item, and that proved to cause a performance issue with large byte arrays. So I figured that byte arrays are generally used for raw data where ordering is important.
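
By the way, if you do need strict ordering for a specific assertion, you shouldn't have to write a custom IOrderingRule; if I recall correctly, the options parameter offers something like this out of the box:

subject.ShouldBeEquivalentTo(expected,
    options => options.WithStrictOrdering());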

Well, that's a wrap. What do you think? Does providing extension points like these help you write better unit tests? Are you missing anything? I'd love to hear your thoughts by commenting below. Oh, and follow me at @ddoomen to get regular updates on my everlasting quest for better solutions.