Dynamic language developers who are concerned about performance end up writing pieces of their applications in C++. So if you’re going to write C++ anyway, why not write your entire application in C++?
Library writers develop in C++ so that their users won’t have to. That makes a lot of sense. But if you’re developing your own application, not a library, maybe it would be better to write everything in C++ instead of wrapping C++ in something else.
A few years ago an immediate objection would have been that C++ is hard to use. But with the advent of C++ 11, that’s not as true as it once was. C++ has gained many of the conveniences traditionally associated with other languages.
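As a rough illustration of those conveniences, here is a minimal sketch using three C++ 11 features: `auto`, range-based `for`, and lambdas. The function names `count_words` and `sorted_by_length` are made up for this example and do not come from any library.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Count how often each word occurs.
// Uses `auto` and a range-based for loop instead of explicit iterators.
std::map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (const auto& w : words) {
        ++counts[w];  // operator[] default-initializes a missing count to 0
    }
    return counts;
}

// Return a copy of the words ordered by length, shortest first.
// The comparator is a lambda written inline; stable_sort keeps
// words of equal length in their original relative order.
std::vector<std::string> sorted_by_length(std::vector<std::string> words) {
    std::stable_sort(words.begin(), words.end(),
                     [](const std::string& a, const std::string& b) {
                         return a.size() < b.size();
                     });
    return words;
}
```

Calling `count_words` on a vector initialized as `{"to", "be", "or", "not", "to", "be"}` gives a map in which `"to"` maps to 2. Brace initialization of the vector is itself a C++ 11 feature.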
Dynamic languages are designed for programmer productivity[1] at the expense of efficiency. If you can afford that trade-off, and quite often you can, then don’t worry about C++. But if you’re using a dynamic language and C++, maybe it would be easier to just stick to C++. Of course the decision depends on many factors — how much of the application needs to be efficient, the size and skill of your team, etc. — but I suspect more projects will decide to do everything in C++ because of its new features.
* * *
[1] “Productivity” implicitly assumes productivity at a certain set of tasks. If your tasks fall into the set of things a language was designed to support, great. But a “productivity” language may not improve your productivity if you don’t meet the language’s intended profile.
John, at Prior Knowledge, we build all of our core pieces out of (awesome, fabulous, Boost-using) C++, and then wrap and glue the pieces together using boost::python. It gives us the best of both worlds, especially because we put a lot of slower meta-control logic in python (easy and quick to change, esp at runtime, for reading in config files, etc).
Doing this, we’ve been able to scale MCMC for nonparametric Bayes to billions of datapoints, and still provide a great python interface for our server/cloud/testing/verification folks. boost::python really is magic.
To address the comment made by Itman, Boost is an amazing set of libraries for C++, but you’re right, a lot of the build/deploy tools for C++ lag a bit (We use CMake). Unless you’re on windows, of course, where Visual Studio is the bee’s knees.
Itman, I’m sure there are some domains where the java legacy codebase is pretty severe, just like I’m sure there are some where the C++ legacy base is severe :) We like c++ partly because it’s so easy to glue real, first-class dynamic languages on top of it. Sadly, jython/jruby never seemed to quite work as well as we wanted. But you’re right, access to existing libs/code matters a lot.
The problem is that a lot of tools/libraries are now written in Java, not in C++.
I was a C++ engineer for 10 years. Without false modesty, I was pretty darned good at it. I could use partially specialized templates, write my own memory allocators for special purposes, write new iterators for complex data structures, etc. with ease.
Now that I program in Ruby on Rails I would NEVER go back to C++.
The joy I find in writing Ruby is unparalleled. There’s almost no translation step from idea to implementation.
It’s FUN and PRODUCTIVE in a way that C++ never was.
Dan: I don’t find writing C++ onerous, so I’m not willing to go to too much effort to avoid writing it. However, I’ve been using it for a long time. If I had to learn C++ now from scratch, I’d find it incredibly intimidating. So I understand the desire to avoid learning C++ better than the desire to avoid writing C++. For those who have to learn and use C++ anyway, it doesn’t seem that hard to write more of it.
Dean: I don’t know about what tools should use C++, but here’s a list of products that do use C++: Programming languages beacon.
John: The analogy with assembler breaks down. C++ is one or two orders of magnitude faster than many other languages, but assembler is not that much faster than C++. In fact, C++ with a modern optimizing compiler will be faster than assembler unless you are very skilled at writing assembler. See the link in my reply to Dean. Most people have concluded that C++ is sufficiently low-level for most performance-critical software.
But C++ is not the only option. There’s also good old C (without ++, but also without any luxury), and for the brave ones Google’s Go (which feels as dynamic as a static language can be).
And, of course, real programmers who care about performance write assembly code ;).
Eric,
They don’t just lag. They don’t exist. E.g. a lot of NLP things are done in Java now. If you want to stay on the shoulders of Giants you just have to use Java. I second that it is much easier to glue than to stick to a specific language.
If you don’t need the performance of C++, and Ruby makes your work easier, then by all means use it. My subject here is the people who are writing Ruby (or Python, or R, or …) and also spending a lot of time writing C++ out of necessity. The justification for this is usually writing code for others, creating efficient components for others to string together.
But I’m thinking that if I’m writing something entirely by myself, and I’m going to be developing some of it in C++, I might as well do it all in C++. I see people write something in Python/NumPy, then tweak the Python code, then rewrite parts of it in Cython, then write more of it in Cython, etc. I personally don’t want to work that way. If I need C++ performance, my first choice would be to just write C++.
Maybe being on a hybrid native/Lua (with an emphasis on the Lua) app for many years now has tainted my worldview, but I actually think of building an entire (large) application in C/C++ as a kind of premature optimization.
I’d actually go a bit further and compare it to deciding to build an app that mostly ran in kernel space instead of user space. The level of rigor required to translate from an idea into an implementation is higher and the consequences of small mistakes are as well (memory corruption/dangling pointer type bugs).
Of course, in our case there’s been a substantial Lua tooling investment that probably isn’t something everybody can afford to do and the dual-language nature would exact a higher complexity cost without that…
Clearly you’re hopeful for C++’s future. I have found no joy, and dwindling satisfaction, in C++ and its supporting tools, so I’m not at all interested in C++ 11 until it’s consistently supported by mainstream platforms.
Would you argue that BitTorrent or the modern DVCS tools should have been written in C++? Can you provide more examples of tasks and work that are best suited for C++ vs. Python/Ruby/C#/Java?
Programmers often use convenient languages like C++ because they’re easier. But, when performance becomes absolutely necessary they switch over and write sections of the code in assembler. This makes no sense, why not just write the whole thing in assembler?
Yeah, I suspect if I spent more time writing C++ it’d probably lessen the idea translation effort substantially. Also, I definitely would not take introducing an additional language into the mix lightly, so I can definitely agree there are cases where adding a scripting language to the mix adds more complexity (API boundaries, bindings, tooling hassles, etc) than it pays back in development ease.
After watching a ton of C++ sessions at Microsoft’s recent Going Native conference I’m much less enthused about the C++ 11 language innovations than I was when I had first heard of them. The new features haven’t actually made C++ safer or simpler, they just made it bigger and even more complicated because they coexist with all the old cruft. You must take ever greater care to navigate an ever-expanding set of options, or everything comes crashing down — in ever more incomprehensible ways.
At this point I think C++ is effectively unfixable due to its requirement for backward compatibility with portable macro assembler that’s half a century old. More features (even good ones!) on top of a fundamentally unsafe and already overcomplicated language are not a solution, they just aggravate the problem.
My main CS prof in college swore by compiled languages and thought dynamic languages should never be bothered with. After several years of C++ with him, I haven’t touched it since. Maybe it’s time for me to give it another shot.
I’m fairly skilled at C++ and I’m working on a little C++11 project right now. It’s a big improvement over C++03, but it’s not nearly as easy to use as python.
One thing to remember is that even if you do have a good grasp of it, most people don’t have a good grasp of C++ or C++ ideas like templates, rvalue references, STL design etc. The truth is, I’d used C++ for years before I really felt like I had a good handle on all aspects of the language, and I’m still learning new things all the time.
On the other hand, python is a language that is easy for even non-professional programmers to understand. Wrapping high performance C++ code in python is easy and it lets non-C++ experts take advantage of that performance in a team environment, or when distributing a library.
When I think of the intersection of development speed and software performance, I think nobody is more focused on that intersection than the games industry.
And, game companies do write their physics and graphics engines in C or C++. But for large games, the actual majority of the code — UI and behaviors — tends to be a layer above, often in something dynamic like Lua, Lisp or Python.
The “Programmer Beacon” site is more than a little disingenuous in that regard, pretending that an application with a C++ kernel and most of its real behavior in a dynamic language is a C++ app. Looking through it, there are several instances of projects that I know contain much more dynamic language than the table lets on.
I guess the reason that C++ and other languages are mixed together in a project is mainly that you need different characteristics in different parts of it.
On the web server front-end, you certainly want lightspeed to filter out unwanted requests and route them to backend servers; that’s what C++ is great at.
On backend servers, you want stability; that is where you want managed code (i.e. Java/.NET), which helps you with garbage (or resource) collection.
If you are targeting domain experts, you might want to expose a DSL based on a dynamic language, which is where the ‘dynamic’ part works great.
Writing the whole system in C++ costs a lot more, and you might never get to the level of stability you get with managed code, even though you might work very hard to get to the ‘just one last memory leak bug’.
If you are willing to write part of your code in something like C++, it is because you have a piece of code that needs to be as fast as possible. And it has to be code where you can’t find any good algorithmic improvements: you can’t be clever about what you’re doing, so you settle for doing the same thing, but faster instead. That happens with two kinds of functions: large, complex functions that have had the benefit of extensive algorithmic optimization, perhaps over the course of years; and pieces of code small enough that there is clearly no low-hanging optimization fruit there to be had. I posit the second case is by far the most common one.
And for small bits of very fast code, C++ is either overkill or insufficient. You can use plain C with no loss of expressiveness or structure, and with less overhead – a tight loop of a hundred lines of code will look much the same whether C or C++ in practice. Or you _really_ want to be fast, in which case you write architecture-optimized assembly and/or grab whatever extra cores, GPUs, local clusters or other resources your system has available (this is the kind of thing libraries like ATLAS do).
If you are going to the effort of using C++ in addition to a high-level language you might as well go all the way, not stop halfway there.
Any sufficiently large or complex application has to be highly customisable by users. This often means a scripting system. So if you know that you will have to implement and maintain some kind of scripting layer anyway, it makes sense to use it for any part of the application where it makes sense to do so.
One other thing to consider is that code in a scripting language is a kind of data. If your system is inherently data-driven, it makes sense to go all the way and encode high-level behaviour (e.g. policy, as opposed to mechanism) as data, too.
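To make the policy-versus-mechanism split concrete, here is a minimal sketch under some assumptions: the mechanisms are a fixed table of compiled operations, and the policy is just a list of step names that could equally have been read from a config file or a script. The step names (`double`, `increment`, `negate`) and the functions `mechanisms` and `run_policy` are hypothetical, invented for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Mechanism: named operations compiled into the application.
// The step names here are hypothetical examples.
const std::map<std::string, std::function<int(int)>>& mechanisms() {
    static const std::map<std::string, std::function<int(int)>> table = {
        {"double",    [](int x) { return x * 2; }},
        {"increment", [](int x) { return x + 1; }},
        {"negate",    [](int x) { return -x; }},
    };
    return table;
}

// Policy: an ordered list of step names, i.e. plain data.
// Changing behaviour means changing the list, not recompiling.
int run_policy(const std::vector<std::string>& steps, int value) {
    for (const auto& name : steps) {
        auto it = mechanisms().find(name);
        if (it == mechanisms().end()) {
            throw std::invalid_argument("unknown step: " + name);
        }
        value = it->second(value);  // apply the named mechanism
    }
    return value;
}
```

With the steps `double` then `increment`, an input of 3 becomes 7; swapping the order gives 8. A real scripting layer offers far more than a list of names, but the division of labour is the same.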
C++11 shifts the sweet spot, no doubt about it. But it doesn’t shift it all the way to the end of the spectrum for every application or product line.
C++ is for sissies. Real Men Write C. :)
Actually, I recently discovered, when writing some Groovy for fun, that there were real advantages in using a language that properly and natively supports unsigned arithmetic: having every 0xFF byte cast as -1 (as happens in Java/Groovy) and not treated as 255 is something of a pain.
On the other hand, it is quicker to get from a three line use case to some working code with a functional language like Groovy.
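The signed-byte pitfall mentioned above can be shown in a few lines of C++, which has both signed and unsigned 8-bit types. This is only a sketch; the function names are made up for the example. One assumption to note: converting 255 to `std::int8_t` is implementation-defined before C++20, although it yields -1 on the two's-complement platforms in common use, and C++20 guarantees that result.

```cpp
#include <cassert>
#include <cstdint>

// Widen an unsigned byte to int: the bit pattern 0xFF stays 255.
int as_unsigned(std::uint8_t b) {
    return b;
}

// Reinterpret the same bit pattern as a signed 8-bit value, which is
// how Java's `byte` behaves. Assumes two's complement (guaranteed
// only from C++20 on; implementation-defined before that).
int as_signed(std::uint8_t b) {
    return static_cast<std::int8_t>(b);
}

// The usual workaround where only signed bytes exist:
// widen first, then mask off the sign-extended upper bits.
int mask_to_unsigned(std::int8_t b) {
    return b & 0xFF;
}
```

Here `as_unsigned(0xFF)` is 255, `as_signed(0xFF)` is -1 under the stated assumption, and masking the signed value recovers 255.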
Based on Comment #6 it sounds like a potential case of premature optimization.
Performance optimization is an iterative process. Whether you spend that time re-writing critical sections in different languages or interfaces (NumPy vs. Cython), or spend it iterating through revisions to debug and even just correctly implement a non-trivial algorithm in a “lower level” language (C++), I don’t think you necessarily gain enough of a performance improvement to justify developing the entire application in the lower level language, assuming that you find the “higher level” language (Python) quicker and easier to write the application in.
I think you gave your own hint of this answer in Comment #11, as you indicate that moving even further in language evolution does produce diminishing returns of performance improvement versus development time and effort.
Rather than an application-and-library division of labour, I would consider writing a kernel or an engine in the low-level language (C or C++, or even assembly where appropriate). By kernel I mean it in the OS sense: not just an interface of low-level routines called by the application, but the core foundation upon which the application is built. The analogy works better for an application suite or environment, where several applications share the problem domain and require the same or related data to be processed in a similar fashion. By engine I mean it in the video game development sense, e.g. a graphics or rendering engine, physics engine, game engine, or audio engine. The application-ee portion can then be left in the higher level language of your choice (Python, Ruby, etc.), so that the part of the application most likely to evolve over time (in my case this has typically been either the presentation layer or extensions of the original functionality) can be easily changed without having to remember the gory details of the low-level code, which takes more time to re-read and re-understand at a much later date.
I am not familiar with the Cython interface to C/C++, but if it is as icky as Perl’s XS, then there may be a cost to the inter-language interfacing that does need to be considered. From my reasonably limited usage of Perl’s XS, I don’t think it is reasonable to assume that an average “Perl programmer” would necessarily be familiar with the interface or competent to maintain it, and I’ll guess that it is similar for Cython. Both look intimidating for someone who isn’t an experienced C/C++ programmer (reasonable given its usage, I admit), which means that the interface layer itself has to be maintained by a potentially smaller pool of suitable developers fluent in both languages. If this is professional rather than personal software, maintenance and life cycle need to be considered, because any useful application may extend beyond its original purpose and usage as well as beyond yourself.
I learned this lesson in my first job, when I was given a piece of COBOL code to port to a new hardware platform and compiler environment (a 64-bit system back in the 1990s). It was run only once a year, so it was hard to justify re-writing the program in a new language, but its output was a specific custom machine-readable fixed field text format necessary to satisfy federal reporting requirements, and none of the then staff were either familiar with COBOL, such as myself, or available to do the code spelunking necessary for a re-write and testing.
i have been coding in c++ for several years and using python for prototyping recently. i agree with one of the commenters above that many significant libraries are written in java and not c++. if you do not know java or can’t use it and you can’t use python libraries either, then you are condemned to re-invent the wheel everyday in c++. it’s not just NLP or ML libraries, not being able to use latest builds of Solr/Lucene, Nutch, Hadoop etc can be very frustrating for a c++ programmer.
From the trenches of .NET programming where I work, it’s interesting to observe that a rewrite to C/C++ is about the last performance optimization considered, if at all; most teams of developers I’ve seen will try anything short of that, e.g. go as “low-level” as they can with their languages and APIs, or implement their own caching mechanisms. In fact, in this domain the switch in languages tends to go in the opposite direction. Arguably performance-critical legacy systems written in C are phased out with high-level-language replacements, usually over time.
So that’s a good question, “why not write [or leave] your entire application in [C] C++?”
Well Webster, the main reason why whole applications are not written in C++ is cost. The _relative_ cost of the team who can effectively write it and _relative_ cost of the tools needed to make professional products with it. Java and dynamic language programmers are cheaper and much more _abundant_ and the third party libraries are frequently free (as in free beer).