Software applications are getting more and more complex and each platform for developing applications these days seems like its operating on a higher and higher level of abstraction to hide that complexity from the developer.
In particular, when I say “a higher level of abstraction”, I am talking about “managed languages”. “Managed code” is a term coined by Microsoft to identify computer program source code that requires and will execute only under the management of a Common Language Runtime (virtual machine), typically the .NET Framework, or Mono. However, this also applies to frameworks such as Java and the JVM (Java Virtual Machine). The individual VMs differ from each other, but the core concept of “managed code” remains the same.
I am a big fan of languages/platforms that allow you to operate at a high level, handle your memory management, transform and optimize your code for the hardware it is running on, and much more.
However, I am always questioned from other developers with “Aren’t managed languages slow? Shouldn’t we use C++ for everything?”
It seems to me that this layer of management between your program and the computer’s processor can be a major source of anxiety for developers who assume that it must add some significant overhead.
There are many unfortunate stereotypes in this world. One of them, sadly, is that managed code cannot be fast. This is just not true. What is closer to the truth is that the CLR (Common Language Runtime) or the JVM (Java Virtual Machine) makes it very easy to write slow code if you are sloppy and not careful.
When you build your C#, VB.NET, or other managed language code, the compiler translates the high-level language to Intermediate Language (IL) and metadata about your types. When you run the code, it is just-in-time compiled (“JITted”). That is, the first time a method is executed, the CLR will invoke the compiler on your IL to convert it to assembly code (e.g., x86, x64, ARM). Most code optimization happens at this stage. There is a definite performance hit on this first run, but after that you will always get the compiled version (there are ways around this first-time hit when it is necessary).
In my experience with .NET/Java the overhead is worth it, and the supposed performance degradation is almost always exaggerated. Often, the performance problems developers blame on .NET/Java and other managed languages are actually due to poor coding patterns and a lack of knowledge of how to optimize their programs on this framework.
Skills gained from years of optimizing software written in C++, C, or VB may not always apply to managed code, and some advice is actually detrimental. Sometimes the rapid development nature of .NET/Java can encourage people to build bloated, slow, poorly optimized code faster than ever before. Certainly, there are other reasons why code can be of poor quality: lack of skill generally, time pressure, poor design, lack of developer resources, laziness, and so on.
The bottom line is that the amount of performance optimization you get out of your application is directly proportional to the amount of understanding you have not only of your own code, but also your understanding of the framework, the operating system, and the hardware you run on. This is true of any platform you build upon.
As most things in life, managed code has its pros and cons and obviously is not a silver bullet. Pros being that the developer can focus on what is important to the business and business logic rather than the details of worrying about pointer addresses and thread management. Some of my favorite advantages managed code provides over it’s unmanaged brethren are as follows:
- Advanced Language features – Delegates, anonymous methods, and dynamic typing.
- Automatic memory management – I think this speaks for itself…
- Easier extensibility – With reflection capabilities, it is much easier to dynamically consume late-bound modules.
- Safety – The compiler and runtime can enforce type safety (objects can only be used as what they really are), boundary checking, numeric overflow detection, security guarantees, and more. There is no more heap corruption from access violations or invalid points.
Of course its not all rainbows, butterflies , and unicorns (that would be nice). The cons for manage code are that operating at this level requires understanding the platform that you are running on and how it handles some of the lower level details for you and recognizing how to use the platform to suite your individual needs. Understanding how to optimize your code for the best performance can sometimes be more complicated with managed code if you are not familiar with performance optimization on these platforms.
Performance optimization can mean many things, depending on which part of the software you are talking about. In the context of managed applications, think of performance in four layers (in order of greatest potential impact):
- Algorithms – Here is where your computer science fundamentals come into play. Remember asymptotic time complexity and analysis of functions as their input approaches infinity? This is where the most time is either gained or lost in programming. (If you aren’t familiar with asymptotic time complexity and “Big-O”, there’s a post for that)
- Framework – This is the set of classes provided by the platform owner (Microsoft .Net Framework libraries, or the Java Standard Edition libraries) that provide standard functionality for things like strings, collections, parallelism, or even full-blown sub-frameworks like Windows Communication Foundation. You need to understand how to use your preferred framework and be familiar with most of its APIs.
- CLR/JVM – Below the Framework is the virtual machine that is running your managed code. Understanding how this is implementing and how it works will provide the next level of potential performance gains.
- Assembly Code – This is where the code hits the metal, so to speak. Once the CLR/JVM JITs the IL/byte code you are actually running processor assembly code. If you break into a managed process with a native debugger, you will find assembly code executing. That is all managed code is – regular machine assembly instructions executing in the context of a particularly robust framework.
Remember that when doing performance analysis, you should always start at the top layer and move down.
Make sure your program’s structure and algorithms make sense before digging into the details of the underlying code. Macro-optimizations are almost always more beneficial than micro-optimizations.
If you liked this post, please share it, like it, or subscribe to my blog atjasonroell.com. Doing so helps me decide what is the best content to post and deliver to all you reading! Thanks again, and have a great day!