What exactly is causing the C++ to run over 30 times slower? If you don't enable optimizations, any benchmarking you do is completely worthless. (And if you do enable optimizations, the loop gets optimized away, so your benchmarking code is flawed too. You need to force the loop to run, usually by summing up the results and printing the sum at the end.)
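Here is a minimal C++ sketch of that "sum it up and print it" approach. The names BenchResult, run_sqrt_benchmark and report are made up for illustration and the iteration count is up to the caller; none of this comes from the original benchmark. Using the sum as an observable output is usually enough to keep an optimizer from discarding the loop, though it is not a hard guarantee.

```cpp
#include <chrono>
#include <cmath>
#include <cstdio>

// Hypothetical result type for this sketch: a checksum plus elapsed wall time.
struct BenchResult {
    double sum;      // accumulated square roots; using this value keeps the loop alive
    double seconds;  // elapsed time measured with a monotonic clock
};

// Sums sqrt(i) for i in [0, iterations) and times the loop.
BenchResult run_sqrt_benchmark(long long iterations) {
    const auto start = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (long long i = 0; i < iterations; ++i) {
        sum += std::sqrt(static_cast<double>(i));
    }
    const auto stop = std::chrono::steady_clock::now();
    return {sum, std::chrono::duration<double>(stop - start).count()};
}

// Printing the checksum makes the loop's result an observable output.
void report(const BenchResult& r) {
    std::printf("checksum = %.3f, elapsed = %.3f s\n", r.sum, r.seconds);
}
```

Calling report(run_sqrt_benchmark(n)) from main, in an optimized build, with n large enough to run for a measurable time, gives a number that is at least worth comparing.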

It seems that what you're measuring is basically "which compiler inserts the most debugging overhead". And it turns out the answer is C++. But that doesn't tell us which program is fastest, because when you want speed, you enable optimizations.

By the way, you'll save yourself a lot of headaches in the long run if you abandon any notion of languages being "faster" than each other. C# no more has a speed than English does.

There are certain things in the C# language that would be efficient even in a naive non-optimizing compiler, and there are others that rely heavily on a compiler to optimize everything away. And of course, the same goes for C++ or any other language. A bad C# compiler will generate slow code. What about a C compiler which generated C# code, which you could then run through a C# compiler? How fast would that run? Languages don't have a speed. Your code does.

I'll keep it brief, since it is already marked answered. C# has the great advantage of having a well-defined floating point model. That just happens to match the native operation mode of the FPU and SSE instruction set on x86 and x64 processors. No coincidence there.

Native C/C++ is saddled with years of backwards compatibility. The /fp:precise, /fp:fast and /fp:strict compile options are the most visible sign of that. Accordingly, the compiled code must call a CRT function that implements sqrt() and checks the selected floating point options to adjust the result. That's slow.

I'm a C++ and a C# developer. Firstly, C# code will NEVER be faster than a C++ application, but I won't go through a lengthy discussion about managed code, how it works, the interop layer, memory management internals, the dynamic type system and the garbage collector. Nevertheless, let me continue by saying that the benchmarks listed here all produce INCORRECT results. When specifying the project properties on your C++ application, make sure you enable "full optimization" and "favour fast code". If you have a 64 bit machine, you MUST specify x64 as the target platform, otherwise your code will be executed through a conversion sub layer (WOW64) which will substantially reduce performance.

Once the correct optimizations are enabled in the compiler, I get .72 seconds for the C++ application and 1.16 seconds for the C# application (both in release build). Since the C# application is very basic and allocates the memory used in the loop on the stack and not on the heap, it is actually performing a lot better than a real application involving objects, heavy computations and larger data sets. Even with this bias, the C++ application completes in just over half the time of the equivalent C# application. Keep in mind that the Microsoft C++ compiler I used did not apply the right pipeline and hyperthreading optimizations (I used WinDbg to view the assembly instructions).

Now if we use the Intel compiler (which by the way is an industry secret for generating high performance applications on AMD/Intel processors), the same code executes in .54 seconds for the C++ executable vs the .72 seconds using Microsoft Visual Studio 2010. So in the end, the final results are .54 seconds for C++ and 1.16 seconds for C#. Most of the time spent in the .54 seconds was in getting the time from the system and not within the loop itself!
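If you want to check how much of a short measurement is really timer overhead, a rough C++ sketch like the following can help. The helper name average_clock_call_seconds is made up for illustration, and it uses std::chrono::steady_clock, which is not necessarily the timer used for the numbers above.

```cpp
#include <chrono>

// Hypothetical helper: estimates the average cost, in seconds, of a single
// std::chrono::steady_clock::now() call by making `calls` calls back to back.
double average_clock_call_seconds(int calls) {
    using clock = std::chrono::steady_clock;
    if (calls <= 0) {
        return 0.0;
    }
    const auto start = clock::now();
    clock::time_point last = start;
    for (int i = 0; i < calls; ++i) {
        last = clock::now();  // the call whose cost is being estimated
    }
    return std::chrono::duration<double>(last - start).count() / calls;
}
```

Comparing that figure with the total elapsed time shows whether the timer calls are a significant share of what was measured.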

What is also missing from the statistics is the startup and cleanup time, which is not included in the timings. C# applications tend to spend a lot more time on startup and on termination than C++ applications. What I found is that without putting some additional code in C#, most of the code in the examples above was actually removed from the binary. This was also the case with C++ when you used a more aggressive optimizer such as the one that comes with the Intel C++ compiler. The results I provided above are 100% correct and validated at the assembly level.

The main problem with a lot of forums on the internet is that a lot of newbies listen to Microsoft marketing propaganda without understanding the technology and make false claims that C# is faster than C++. The claim is that in theory, C# is faster than C++ because the JIT compiler can optimize the code for the CPU. However, an experienced developer will know the right compiler to use for the given platform and will use the appropriate flags when compiling the application. On Linux or open source platforms, this is not a problem because you can distribute your source and create installation scripts that compile the code using the appropriate optimizations. On Windows or closed source platforms, you will have to distribute multiple executables, each with specific optimizations. The Windows binaries that get deployed are then chosen based on the CPU detected by the MSI installer (using custom actions).


