Which language is faster on z/OS, Java or C++? People will tell you C++ is fast and Java is slow, but does that stand up to a drag race?
Dave Plummer is a retired operating systems engineer from Microsoft. He has created an interesting series of videos “drag racing” different languages and different hardware with a small program searching for prime numbers. The initial video raced C++, Python, and C#. Then he raced an Apple M1 vs an AMD ThreadRipper 3970X vs a Raspberry Pi.
I thought it would be interesting to run the drag race on z/OS, putting C++ up against Java. z/OS people like to tell you that Java is slow – but is that really true?
The program uses the sieve of Eratosthenes to search for prime numbers. It works through the odd numbers starting at 3, marking every multiple of the current number as “not prime”. Then it moves to the next number that has not already been marked as a multiple of another number and repeats the process. At the end, the numbers that have not been marked are prime.
This is repeated for numbers up to 1,000,000 as many times as possible in 5 seconds, and the number of passes is the result.
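As a rough illustration of the algorithm just described, here is a minimal C++ sketch of one pass. It is not the code from the repository: the name `countPrimes` is made up for this example, it uses one `char` per number for clarity, and it assumes two common refinements (starting the marking at the square of each factor, and stopping once the factor exceeds the square root of the limit).

```cpp
#include <cstddef>
#include <vector>

// Count the primes up to and including `limit` with a sieve of Eratosthenes.
// Index i of `composite` stands for the number i. Even indices are never
// touched because both loops only visit odd numbers.
std::size_t countPrimes(std::size_t limit) {
    if (limit < 2) return 0;
    std::vector<char> composite(limit + 1, 0);

    for (std::size_t factor = 3; factor * factor <= limit; factor += 2) {
        if (composite[factor]) continue;  // already a multiple of a smaller prime
        // Assumed refinement: start at factor * factor, because smaller
        // multiples were marked by smaller factors. Stepping by 2 * factor
        // skips the even multiples.
        for (std::size_t multiple = factor * factor; multiple <= limit;
             multiple += 2 * factor) {
            composite[multiple] = 1;
        }
    }

    std::size_t count = 1;  // 2 is the only even prime
    for (std::size_t n = 3; n <= limit; n += 2) {
        if (!composite[n]) ++count;
    }
    return count;
}
```

One pass in the drag race corresponds to one call such as `countPrimes(1000000)`. A known prime count for the limit gives a quick way to check that a pass produced the right answer.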
The “drag race” description acknowledges that this isn’t a comprehensive benchmark, just a test of speed at a particular task like drag racing a car.
Setup
The C++ and Java programs had been developed and refined on other platforms. The Java code ran without modification, but the C++ code required a few changes:
- I couldn’t find <chrono> on z/OS so I used gettimeofday for the timing
- Some changes to initialization etc. were required due to unsupported syntax
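For readers who have not used `gettimeofday`, the timing change might look something like the sketch below. This is an assumption about the general shape, not the code in the repository: `nowSeconds` and `countPasses` are hypothetical names, and `work` stands in for one sieve pass.

```cpp
#include <cstddef>
#include <sys/time.h>

// Wall-clock time in seconds, read with POSIX gettimeofday.
double nowSeconds() {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return static_cast<double>(tv.tv_sec) + tv.tv_usec / 1e6;
}

// Call `work` repeatedly until `durationSeconds` have elapsed and
// return the number of completed passes.
template <typename Work>
long countPasses(double durationSeconds, Work work) {
    const double start = nowSeconds();
    long passes = 0;
    while (nowSeconds() - start < durationSeconds) {
        work();
        ++passes;
    }
    return passes;
}
```

With a duration of 5.0 and a sieve pass as `work`, the return value would be the score reported in the results. The clock is only checked between passes, so the last pass can run slightly over the limit.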
The C++ code was compiled from the unix command line:
xlc -o PrimeCPP31 -O3 -Wl,xplink -Wc,xplink,-qlanglvl=extended0x PrimeCPP.cpp
I configured the zIIP offline for the tests so that the C++ and Java code were running on the same processor.
All source code is available here, if you want to try it out on your own system:
https://github.com/andrew890/Primes-zOS
Note: z/OS CPU speeds vary widely based on the capacity purchased. The z15 LSPR ratios list z15 systems with single CPU MSU ratings from 12 MSU to 253 MSU – a 20x difference! The numbers here should be a reasonable comparison between the languages tested, but be careful comparing them with a different system.
Round 1
Source code:
- C++ : https://github.com/andrew890/Primes-zOS/blob/main/PrimeCPP/solution_1/PrimeCPP.cpp
- Java : https://github.com/andrew890/Primes-zOS/blob/main/PrimeJava/solution_1/PrimeSieveJava.java
Results (higher number is better):
C++ | Java |
1295 | 4807 |
I was surprised – I expected Java to do well, but I didn’t expect C++ to do so badly.
There wasn’t anything I could see in the C++ code to make it slower than the Java code. However, marking and checking numbers is the majority of the work, and this processing is hidden inside a vector<bool> in the C++ code. Using vector<bool> was apparently a big gain on other platforms, but maybe not on z/OS?
I changed the C++ code to use bits in an unsigned char array, explicitly testing and setting bits. This was the method Dave used in his initial code. The Java code used a boolean array. To give the closest possible comparison between C++ and Java I also changed the Java code to use a byte array with the same bit testing/setting.
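The explicit bit testing and setting can be sketched roughly as follows. This is an illustration of the technique rather than the repository code: the class name `BitFlags` is hypothetical, and it wraps the bytes in a `std::vector<unsigned char>` for memory safety where the actual solution uses a raw `unsigned char` array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One flag per number, packed eight to an unsigned char.
class BitFlags {
public:
    explicit BitFlags(std::size_t size)
        : size_(size), bytes_((size + 7) / 8, 0) {}

    // index >> 3 selects the byte, index & 7 selects the bit inside it.
    bool test(std::size_t index) const {
        assert(index < size_);
        return (bytes_[index >> 3] & (1u << (index & 7))) != 0;
    }

    void set(std::size_t index) {
        assert(index < size_);
        bytes_[index >> 3] |= static_cast<unsigned char>(1u << (index & 7));
    }

private:
    std::size_t size_;
    std::vector<unsigned char> bytes_;
};
```

The same shifts and masks carry over to a Java `byte[]`, which is what makes the two Round 2 versions directly comparable.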
Round 2
Source Code:
- C++ : https://github.com/andrew890/Primes-zOS/blob/main/PrimeCPP/solution_2/PrimeCPP.cpp
- Java : https://github.com/andrew890/Primes-zOS/blob/main/PrimeJava/solution_2/PrimeSieveJava.java
Results (higher number is better):
C++ | Java |
4828 | 2715 |
This was a much better result for C++. It looks like the vector<bool> implementation on z/OS is not as good as on other platforms. However, in Java the original boolean array solution was much better than the byte array version. The improved C++ version didn’t significantly beat the original Java solution.
On other platforms C++ was faster than Java by 40-70%. On z/OS the versions using the byte array showed a similar margin (4828 vs 2715 passes, or about 78% faster). I don’t doubt that you could write a C++ version to beat the fastest Java version on z/OS, but I don’t think it would be easy.
Bonus: COBOL
Someone contributed a COBOL version. I tried that out of interest, compiled with OPT(2):
Source Code:
Result:
COBOL |
2373 |
Better than the worst C++, but not as good as Java. To be fair, this program is a long way from the type of work COBOL was designed for. I don’t know COBOL well enough to judge if it could be improved.
Scaling it up
The other interesting test is to scale up from 1,000,000 to larger numbers. I repeated the tests using the different solutions for primes up to 10,000,000, 100,000,000 and 1,000,000,000.
The most interesting result here is the Java boolean[] version. This version is as fast as the fastest C++ version for 1,000,000, but the speed declines much faster as the maximum increases. I guess Java is doing some optimizations that don’t work as well for 1 billion element arrays!
The trend was strong enough that it seemed interesting to try a smaller number as well, so I added a 100,000 run. Very interesting – for 100,000, the Java version using the boolean array was more than 20% faster than C++!
Solution | 100,000 | 1,000,000 | 10,000,000 | 100,000,000 | 1,000,000,000 |
C++ using vector<bool> | 14,377 | 1,295 | 122 | 10 | 1 in 7.19 seconds |
Java using boolean[] | 64,423 | 4,807 | 271 | 13 | 1 in 9.06 seconds |
C++ using unsigned char* | 52,251 | 4,828 | 417 | 28 | 2 in 5.14 seconds |
Java using byte[] | 30,425 | 2,715 | 237 | 14 | 2 in 6.99 seconds |
COBOL | 19,270 | 2,373 | 86 | 5 | 1 in 21.0 seconds |
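To make the trend easier to see, the sketch below divides the Java boolean[] score by the C++ unsigned char* score for each limit, using the figures from the table above. The 1,000,000,000 column is left out because those runs were not scored as passes in 5 seconds. The names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Passes in 5 seconds, copied from the table above.
const std::size_t kSizes = 4;
const long kLimits[kSizes] = {100000L, 1000000L, 10000000L, 100000000L};
const double kJavaBoolean[kSizes] = {64423, 4807, 271, 13};
const double kCppUnsignedChar[kSizes] = {52251, 4828, 417, 28};

// Java boolean[] score relative to C++ unsigned char* (1.0 means equal).
double javaToCppRatio(std::size_t i) {
    assert(i < kSizes);
    return kJavaBoolean[i] / kCppUnsignedChar[i];
}

// Print one line per limit with the ratio to two decimal places.
void printRatios() {
    for (std::size_t i = 0; i < kSizes; ++i) {
        std::printf("%ld: %.2f\n", kLimits[i], javaToCppRatio(i));
    }
}
```

The ratio falls from roughly 1.23 at 100,000 to about 1.00 at 1,000,000, 0.65 at 10,000,000 and 0.46 at 100,000,000.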
Java Overhead
Java has some overhead starting the Java Virtual Machine. This can be seen in the SMF data.
The SMF data shows the C++ programs had about 4.95 seconds CPU time and 5.02 seconds elapsed time for the 5 second duration measured by the program.
The Java programs had about 5.24 seconds CPU time and 6.16 seconds elapsed. This presumably reflects the overhead of starting the JVM. There was only one CPU online, so any runtime overhead after the program records the start time will be reflected in the score. Background JVM threads such as garbage collection could not run in parallel on another CPU and accumulate CPU time without slowing the main program. This startup overhead should be less significant for longer running programs.
Conclusion
Java on z/OS is not slow. It can match C++ for speed, to the point where the selection of algorithms and data structures is more important than the language itself. Java deserves to be considered a high performance language on z/OS, as much as C++ or COBOL. There is one caveat: there is significant overhead starting the JVM, so it might not be a good choice for small programs that run very frequently.
Java’s reputation for being slow probably comes from the ease of combining existing components into very large applications, where the programmer may not even be aware of the size of what they have built.
Many z/OS systems have general purpose CPs running at less than full speed to reduce software bills. If you have zIIPs running at full speed, Java might actually be the fastest language on your system by a fair margin, with the bonus that the Java work probably doesn’t contribute to software costs.
Dave’s Videos
Here are direct links to the first 2 of Dave Plummer’s Software Drag Racing videos: