Ausschnitte aus dem Gehirn einer Katze: coding


Sunday, 24 September 2017

KittyDI or "How I wrote my own dependency injection container"

Since KittyDI is rather well developed (I mainly just miss the NuGet packaging and publishing stuff and can't find the motivation to actually do the PRISM integration), this blog post will be more of a retrospective than a developer's diary. I still file it as such, because it is planned as the first part of a series about me creating two DI container implementations.
But first things first:

How it all began

In the beginning there was... a project at Zühlke, my employer. While I can't disclose much of the project, I can say that we worked on a complicated enterprise application using Autofac and PRISM. At one point we used Autofac to populate a list with view models for the items to be displayed. I thought that to be a rather nice approach, as we only had to resolve a function that takes the model as an argument and returns the view model. Autofac then takes care of selecting a constructor for which it can resolve every dependency except the model from the types registered with the container, and passes in the model as the last missing dependency.

This all worked pretty well, but it took a long time for the list of viewmodels to fill (with about 100 items in the list of models). So I wondered aloud why it takes so long to create 100 instances of a type and a coworker replied "Well that's obvious. Autofac needs to go through all the strategies of creating the viewmodel for each of the 100 instances."

My first thought was "Really? Can't it just re-use the strategy it used the last time?" Of course I dismissed that thought as being too naive and instead considered implementing my own dependency injection container just to learn the pitfalls of doing so. I just never got around to doing that.

In my current project we're using the lightweight Java dependency injection container "Feather" to create an Android project. Inspired by the simple idea that Feather implements (which was basically the same idea I had back then), I decided to finally start on my own container. So yes, KittyDI is inspired by Feather, but no, I never used their code as an actual example.
Why did I choose KittyDI as the name? Because I viewed this DI container as an experiment. A first, naive, maybe even childish try to implement a DI container and to mature by doing so. The relation to felines should be obvious while reading this blog. If it isn't, check the archives.

The goal

My main goal of course was to learn, maybe to fail and see why, to see where the complexity lies and to see if writing a DI container really is as hard as I thought and as people made me think it was. (Spoiler: It isn't.)

For dependency injection my goals were to offer a mixture of the features MEF, Autofac and Feather offer. This means:

Being able to resolve a type which is not yet known by the container
Telling the container which type to resolve when an interface is requested
Registering types as singletons
Resolving all registered implementations of an Interface
Resolving a factory for a type
Resolving a factory that takes parameters
Putting containers inside each other
Creating and resolving containers
Automatic disposal of container contents on container disposal
Resolving generics
And of course remembering how a type got resolved last time

Of these features I only scrapped resolving all registered implementations of an interface.

The work

I implemented KittyDI in an incremental way.

The first functionality that got added was simple resolving of an unknown type. The approach was to check the constructors of the requested type and select the one that fits. This brought the first decision: how do I choose which constructor to use? I decided on a simple approach that doesn't force the user to add an attribute to the constructor of each type they want to resolve (like MEF does). Instead I decided that KittyDI always uses a parameterless constructor if there is one. If there is no parameterless constructor but only a single constructor with parameters, that one is used and the parameters are in turn resolved. Only if there is no parameterless constructor and instead multiple constructors taking parameters is the attribute needed to tell KittyDI which one to use. This does two things at once: it enables the user to simply resolve types without the danger of missing the attribute and getting cryptic exceptions, and it makes it possible to use types defined in libraries that don't know about KittyDI. The lack of that was a major flaw I personally saw in MEF. After KittyDI has decided how to resolve a type, it wraps that in a function and stores it in a dictionary so it doesn't have to search for the right constructor again.
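
The selection rule described above can be sketched in a few lines. This is only an illustration in Python, not KittyDI's actual .NET code: `Ctor` and `choose_constructor` are hypothetical names, a constructor is modelled as nothing more than its parameter types plus a flag for the marker attribute, and raising an error when the marking is ambiguous is an assumption on my side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ctor:
    """Hypothetical stand-in for a constructor."""
    param_types: tuple = ()   # types of the constructor parameters
    marked: bool = False      # True if it carries the marker attribute


def choose_constructor(ctors):
    """Pick the constructor to use, following the three rules from the text."""
    # Rule 1: a parameterless constructor always wins.
    for ctor in ctors:
        if not ctor.param_types:
            return ctor
    # Rule 2: a single constructor with parameters is unambiguous.
    if len(ctors) == 1:
        return ctors[0]
    # Rule 3: with several candidates, exactly one has to be marked.
    marked = [ctor for ctor in ctors if ctor.marked]
    if len(marked) == 1:
        return marked[0]
    # Assumption: anything else is reported as an error.
    raise LookupError("cannot choose a constructor: mark exactly one")
```

A type from a library that knows nothing about the container falls under rule 1 or 2 in most cases, which is why no attribute is needed for it.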

In order to prevent endless resolution loops, KittyDI hands the resolution stack (meaning the types of all factories currently in the call stack) to each internal factory. If a factory is called while its type is already on the stack, we are in a resolution loop and throw an exception notifying the caller of the circular dependency.
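
Here is a minimal sketch of both ideas together, the dictionary of remembered factories and the resolution stack. It is written in Python under a few assumptions: dependencies are read from the type hints of `__init__` (Python classes have only one constructor, so no selection is needed), every parameter is annotated, and `Container`, `CircularDependencyError` and the example classes are made-up names.

```python
from typing import get_type_hints


class CircularDependencyError(Exception):
    """Raised when a type (indirectly) depends on itself."""


class Container:
    def __init__(self):
        # type -> factory(stack); built once, reused on later resolutions
        self._factories = {}

    def resolve(self, cls):
        return self._factory_for(cls)(())  # start with an empty stack

    def _factory_for(self, cls):
        if cls not in self._factories:
            init = cls.__init__
            hints = {} if init is object.__init__ else get_type_hints(init)
            deps = {name: t for name, t in hints.items() if name != "return"}

            def factory(stack):
                # The stack holds the types currently being created.
                if cls in stack:
                    chain = " -> ".join(t.__name__ for t in (*stack, cls))
                    raise CircularDependencyError(chain)
                inner = (*stack, cls)
                args = {n: self._factory_for(t)(inner) for n, t in deps.items()}
                return cls(**args)

            self._factories[cls] = factory
        return self._factories[cls]


class Engine:
    pass


class Car:
    def __init__(self, engine: Engine):
        self.engine = engine


class Chicken:
    def __init__(self, egg: "Egg"):  # forward reference, resolved lazily
        self.egg = egg


class Egg:
    def __init__(self, chicken: Chicken):
        self.chicken = chicken
```

Resolving `Car` works and leaves factories for `Car` and `Engine` in the dictionary. Resolving `Chicken` raises `CircularDependencyError` instead of recursing forever.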

The next step was rather simple. After refactoring out the functionality which creates the factory for a type if it's not yet in the dictionary, I added the possibility to resolve that exact factory by providing a special function for that. In fact, normal resolution turned into "resolve the factory and then execute it". Spoiler: I removed that function after I added generic resolving and turned resolution of a factory into a generic resolver.

Singletons are handled by KittyDI either by setting an optional boolean parameter to true when registering the type or by adding an attribute to the type. If the type should be a singleton, its factory is wrapped by one that executes the inner factory on the first call and then just returns the result of that call on subsequent calls.
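
The wrapping itself is tiny. A sketch in Python, where `as_singleton` is a hypothetical helper name and thread safety is deliberately ignored:

```python
def as_singleton(factory):
    """Wrap `factory` so that it runs once and its result is reused."""
    cache = []  # empty until the first call, then holds the one instance

    def wrapper(*args, **kwargs):
        if not cache:
            cache.append(factory(*args, **kwargs))
        return cache[0]

    return wrapper
```

Because the wrapper has the same shape as the factory it wraps, the rest of the container does not need to know whether a type is a singleton.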

The next rather important point was the ability to tell KittyDI which type to use to resolve an interface. This was also pretty naively implemented: first the function which ensures that the factory for a given type has been created is called for the implementing type, and then that factory is put into the dictionary of known factories as the factory for the "contract type". This way it is possible to do that not just for interfaces but for all types, provided the implementing type actually inherits from or implements the contract.
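
Sketched in Python, with constructor handling reduced to "call the class" to keep it short. The names `register_implementation` and `_ensure_factory` are made up for this example, and the `issubclass` check stands in for "inherits from or implements the contract":

```python
from abc import ABC, abstractmethod


class Container:
    def __init__(self):
        self._factories = {}  # type -> factory

    def _ensure_factory(self, cls):
        # Simplification: the factory is just the parameterless constructor.
        if cls not in self._factories:
            self._factories[cls] = cls
        return self._factories[cls]

    def register_implementation(self, contract, implementation):
        if not issubclass(implementation, contract):
            raise TypeError(
                f"{implementation.__name__} does not implement {contract.__name__}"
            )
        # Reuse the implementation's factory under the contract's key.
        self._factories[contract] = self._ensure_factory(implementation)

    def resolve(self, cls):
        return self._ensure_factory(cls)()


class Logger(ABC):
    @abstractmethod
    def log(self, message):
        ...


class ListLogger(Logger):
    def __init__(self):
        self.lines = []

    def log(self, message):
        self.lines.append(message)
```

After `register_implementation(Logger, ListLogger)`, resolving `Logger` yields a `ListLogger`.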

Next up was generic resolution. This means that I created a list of "generic resolvers" which intercept requests and perform their own logic instead of searching for constructors on the generic type. This is now used to resolve factories, but it was also used to resolve IEnumerable<T> - something that I removed after I also added a generic resolver for Lazy<T>. I wanted nesting of generics to be allowed, and IEnumerable proved troublesome here with the architecture KittyDI evolved into. Thus I scrapped resolution of enumerables and moved that to PantherDI, where I aim to give the architecture a bit more thought instead of letting it evolve.
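
To make the idea of a generic resolver more concrete, here is a rough Python sketch with a single resolver for a home-made `Lazy[T]`. Everything in it is an assumption for illustration: the `Lazy` class, the resolver signature `(container, requested) -> factory or None`, and the fallback of simply calling the requested class.

```python
from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar("T")


class Lazy(Generic[T]):
    """Creates its value on first access to `value`."""

    def __init__(self, factory):
        self._factory = factory
        self._created = False
        self._value = None

    @property
    def value(self):
        if not self._created:
            self._value = self._factory()
            self._created = True
        return self._value


def lazy_resolver(container, requested):
    """Handle requests for Lazy[X]; return None for everything else."""
    if get_origin(requested) is Lazy:
        (inner,) = get_args(requested)
        return lambda: Lazy(lambda: container.resolve(inner))
    return None


class Container:
    def __init__(self):
        self._generic_resolvers = [lazy_resolver]

    def resolve(self, requested):
        # Generic resolvers get the first chance to intercept the request.
        for resolver in self._generic_resolvers:
            factory = resolver(self, requested)
            if factory is not None:
                return factory()
        # Simplified fallback: use the parameterless constructor.
        return requested()
```

Since the resolver asks the container for the inner type again, a nested request such as `Lazy[Lazy[Engine]]` works without extra code.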

Now adding the possibility to resolve a factory which takes parameters was rather easy. I just needed to register another generic resolver (all of which have full access to the container) that triggers resolution with the types given in the parameters already set. The change I had to make was to add another parameter to my internal factories, which is a dictionary of all types provided by the caller. While doing so, I decided to move all the information passed into an internal factory into an object instead of having to change the signature of functions all over KittyDI each time I added something.
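
A small Python sketch of that change, using the view model scenario from the beginning of this post. `ResolutionContext`, `resolve_factory` and the example classes are hypothetical names, and the sketch skips caching and loop detection to stay short:

```python
from dataclasses import dataclass, field
from typing import get_type_hints


@dataclass(frozen=True)
class ResolutionContext:
    """Everything an internal factory needs, bundled in one object."""
    stack: tuple = ()                              # types being resolved
    provided: dict = field(default_factory=dict)   # type -> caller's instance


class Container:
    def resolve(self, cls, context=None):
        if context is None:
            context = ResolutionContext()
        # Values handed in by the caller win over anything the container knows.
        if cls in context.provided:
            return context.provided[cls]
        init = cls.__init__
        hints = {} if init is object.__init__ else get_type_hints(init)
        args = {n: self.resolve(t, context)
                for n, t in hints.items() if n != "return"}
        return cls(**args)

    def resolve_factory(self, cls, *param_types):
        """Return a function that takes the given types and creates `cls`."""
        def factory(*values):
            if len(values) != len(param_types):
                raise TypeError(f"expected {len(param_types)} argument(s)")
            provided = dict(zip(param_types, values))
            return self.resolve(cls, ResolutionContext(provided=provided))
        return factory


class Service:
    pass


class Model:
    def __init__(self, name):
        self.name = name


class ViewModel:
    def __init__(self, service: Service, model: Model):
        self.service = service
        self.model = model
```

With this, `container.resolve_factory(ViewModel, Model)` gives a function that turns a model into a view model, and the container fills in `Service` on its own.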

To prepare for more complex scenarios, the container supports adding whole containers. This enables the concept of scoping. Each step that is taken to resolve a type (checking the dictionary of known resolutions, checking if a generic type can be resolved and registering the type as a new type) also searches the DI containers that have been added to the container on which the original request was made. Only if a step yields no result across the whole tree of containers does the next step start.

Creating child containers then was rather easy. The child container is an empty container which searches its parent before failing each step. After creation, types can be registered to the child, but they won't be known to the parent.
At the same time resolving a container (resolving the types "DependencyContainer" and "IDependencyContainer") yields a new child on each resolution.
This enables services or view models to register types only to the child and then resolve a helper that uses those.
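
A simplified Python sketch of child containers. It only models the lookup in the dictionary of known factories, and it assumes that an unknown type gets registered on the container that received the request. The names are made up for the example.

```python
class Container:
    def __init__(self, parent=None):
        self._parent = parent
        self._factories = {}  # type -> factory

    def register(self, cls, factory):
        self._factories[cls] = factory

    def create_child(self):
        return Container(parent=self)

    def _find_factory(self, cls):
        # Look at this container first, then walk up through the parents.
        container = self
        while container is not None:
            if cls in container._factories:
                return container._factories[cls]
            container = container._parent
        return None

    def resolve(self, cls):
        # Asking for the container itself yields a fresh child each time.
        if cls is Container:
            return self.create_child()
        factory = self._find_factory(cls)
        if factory is None:
            # Assumption: unknown types are registered where they were requested.
            factory = self._factories[cls] = cls
        return factory()
```

A service can resolve `Container`, register its own helper types on the child it receives, and resolve from there, without the parent ever learning about those registrations.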

Last but not least, the container supports disposal. If it is disposed, it will dispose all instances of singletons that it has created as well as all of its child containers.
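
In Python terms this could look roughly like the sketch below. It uses a `close()` method in place of .NET's `IDisposable`, and the order (children first, then singletons in reverse order of creation) is my assumption for the example, not something this post specifies.

```python
class Container:
    def __init__(self):
        self._singletons = {}   # type -> the one instance created so far
        self._children = []
        self._disposed = False

    def create_child(self):
        child = Container()
        self._children.append(child)
        return child

    def resolve_singleton(self, cls):
        if cls not in self._singletons:
            self._singletons[cls] = cls()
        return self._singletons[cls]

    def dispose(self):
        if self._disposed:      # disposing twice is harmless
            return
        self._disposed = True
        for child in self._children:
            child.dispose()
        # Assumption: newest singletons are closed first.
        for instance in reversed(list(self._singletons.values())):
            close = getattr(instance, "close", None)
            if callable(close):
                close()
```

Only instances the container created itself are touched. Objects without a `close()` method are simply skipped.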

Conclusion

As I already said, writing a DI container is not that hard. But my main takeaway is - and I think that is a general rule in software development - that you should think beforehand about how you want to achieve your goal, unless you want your code to become more and more messy and complicated to read over time.

Even though the PRISM integration isn't done, I see KittyDI as done and the experiment a success (KittyDI can be used in projects) and have many ideas - also on what to think about beforehand - for the mature version "PantherDI" (by now the source of the name should be obvious to the reader).


Wednesday, 14 December 2011

On code formatting

Creating a piece of software is a creative process. Just like an artist sculpts a piece of rock with his tools so it looks like the picture of the statue that he has in mind, a programmer has a distinct idea of how his program is going to work. A programmer uses a different way to put his idea into reality than an artist though. He does not need a hammer or other physical tools. Instead he utilizes a computer and writes down his idea in a language that the computer understands.

Of course most of these languages have rather strict rules, so a machine can understand them. But these limitations are a reason why a programmer needs creativity to achieve his goal: He can't just write "Check the weather and tell me what I need to wear today!" The computer does not know what "weather" is and so it does not know how to check the weather. So the programmer needs to tell the computer what "weather" is and what it needs to know to determine which clothes (another thing that needs explanation for a computer) to wear. So a good program that uses online weather forecasts to tell which clothes to wear will end up as lots of files written in a programming language.
These source codes are very easy to read for a computer and they also can not be misunderstood by it because everything has only one single meaning. Since a computer reads the code differently, in most of the languages it does not matter how big the spaces between the words in our sentences are. Most of the time it does not even matter if we write everything in one long line which you could only write on a very very long roll of paper. But writing everything in a long line makes the code which already is not that easy to read even harder to understand. Thus programmers got together and agreed: "We shall write our programs, so they are easy to read." And they talked about how one could do that and wrote down their results. These ideas that they had are now known as conventions of style between programmers. And mind you - there is not only one set of conventions of style, but there are many different for each programming language and every single programmer adapts the rules that he thinks are best for his own code.
Although I already wrote much about those conventions I don't want to go into detail here. Even though this entry is titled "On Code Formatting", I don't want to talk about how to format your code right. I personally don't believe that there is only one single way of doing it right. Instead there are lots of different styles which all serve different purposes.

Some - especially new - programmers are too lazy or hasty to do proper code formatting. This is something that I've seen especially in students who hand in solutions for work assignments. And yes, I have even done this myself - and I still do in some circumstances. The reason for this is rather simple: code formatting means more work while writing your code - and that extra work is the only thing you see while you are doing it. The big payoff comes later, when you have an error in the code and you want to find it. In well-formatted code you can find a specific thing much more easily.

*** Warning: From here on I will start ranting a bit. ***
Nowadays you can choose from many different programs to write your code in. Programmers call these "IDEs". Some of them, like NetBeans, Visual Studio, Eclipse, etc., come with a hell of a lot of functions that you'll probably never need, while others are simple text editors such as Notepad, gedit, vim, etc.
Those functions also cover code formatting.

A simple example: When you write your code and you start a block like "Do this block of instructions only if it is the 3rd of November: ...", it is agreed that you indent the instructions inside the block, so you can easily see what is inside and where the block ends. If you press the enter key to start a new line, very simple editors like Windows' Notepad go to the beginning of that line. Other editors - especially those that were made for programmers - start the new line at the same indentation as the previous one, so you don't need to press the space bar many times to get where you want to be.

Other programs go as far as to automatically add spaces after commas, mathematical operators (+, -, etc.) and the like and some even add line breaks for you automatically in specific cases. Of course this leads to a dispute about whether those functions are good and helpful or whether they are harmful and bad.

While a lot of programmers say "A good programmer should be able to write well-formatted code by himself, thus those programs are made by the devil", I beg to differ. As a programmer I am - as stated above - undergoing a creative process, and while writing my code I am already restricted in my actions by the programming language itself (just like a musician who is restricted by his instrument - he will never be able to make piano sounds come out of his violin). I need to think about how to make the program do what it should (we programmers call this "solving the problem"), and when I have to heed the code formatting rules, I need to divert thinking power from "How am I going to make this work?" to "Oh, is this written neatly, so everyone can read it?" Especially when the opinions of what neat code actually is differ. So if my IDE starts formatting my code for me as I write it (or when I tell it to, if I don't want it to happen automatically), I can concentrate on what my work is all about: explaining to the computer what it has to do in a language that it understands. In my personal opinion, having to bother about correct indentation, spacing and so on is a burden to a programmer. I think now that we are living in the year 2011 we should leave those tasks to someone who can do them much faster and more reliably: the computer. I mean - that's what they're for, aren't they? Okay - apart from porn and video games. But when a programmer writes a program which is to become a tool for someone else, he is making that person's life easier by letting them use the computer for something they would otherwise have to do by hand.

It seems that programmers want to be the exception. Actually the reason I am writing this is because someone told me that people who make tools that automatically format your code don't know anything about programming. I think it's exactly the opposite: They understood what being a programmer means. Programming isn't just the process of writing down your code into a file and then letting the computer "translate" it into an internal language, so it becomes a real program. It is a process of thinking about how you are going to tell the computer what to do. Especially during "debugging" - which is finding errors and mistakes in programs - you often find yourself in the situation where you don't understand what could be wrong. Sometimes the approach that you take is to simply recode a specific part in a different way - just like when you take a paragraph from a novel and replace it with a new one which contains the same content but is written in your own words. But it is not about concentrating on writing neat code - especially not when it's proven that the computer is able to do that for you (in some IDEs even so highly configurable that it will use your personal style for the automatic code formatting).

Well, of course not all IDEs have these functions, and those that have them don't support every language - but to me a simple question arises: "Why?" Of course, for small languages nobody bothered to write routines that do that for you, but there are also bigger, commonly used languages that don't have a single IDE that supports automatic code formatting. It seems as if the people "hating" this function block the idea of adding it to the IDEs...

I have actually never heard any arguments for formatting the code yourself, which is kinda sad, because it gives one the (hopefully wrong) feeling that people who are strongly against auto-formatting (and I am talking about those who want to deny the use of that function to others) are afraid of losing a kind of elite position as coding becomes easier and easier. Either that or they are jealous that they didn't have those functions "back in their days". So the (unspoken) argument would be: "No, you can't take that hardship away from programmers or else programming would become mainstream" - or "No, you can't give them that feature - they already have it easy enough not having to punch holes in big stacks of paper cards to code their programs". Nothing against punch cards, by the way. I personally think they are cool, but - they are as outdated as having to format your code by hand.

Well, this rant would not be complete without a little clarification for those who like formatting their code by hand: I don't have anything against doing it voluntarily. I just can't understand why it is necessary to force others to do it as well.
And I also need to add that I am open to arguments for having to do it yourself, as long as they are not "So you are able to do it when you don't have the IDE that does it for you" - seriously: IDEs can be changed so they support it, this just isn't done - or "Because I didn't have that function either."

Thank you for reading so far. I will post further rants - erm - posts about other topics when I feel like it.