Erlang vs. Java message passing times
by vijay • March 31, 2009 • Erlang, Java, News • 13 Comments
I benchmarked ring programs in Erlang and Java for various values of N (number of threads) and M (number of messages). You can see my Erlang solution here and Java solution here. For N, I ran 10, 50, 100, 500, 1000, 5000 and 10000 threads. The values I used for M are 10, 100, 1K, 10K, 100K, 1 Million.
The results are available in Google Spreadsheet (no login necessary):
http://spreadsheets.google.com/ccc?key=p6lbXDA9b-EEO5KGTm2r-Ag
The report can also be viewed in HTML format:
http://spreadsheets.google.com/pub?key=p6lbXDA9b-EEO5KGTm2r-Ag&output=html
Note that the times published are wall clock times. Here is my hardware info and the commands I ran:
| Environment | Pentium(R) D 3.00GHz, 1GB RAM, Dual Core, Windows XP |
| Java params | java -server -Xss1K |
| Erlang params | erl (didn’t need -p) |
The following 2 graphs show times (in millis) when 10, 100, 1K, 10K, 100K, 1 Million messages were passed around in a ring of 5,000 threads.


Lessons learned:
I know this is a micro benchmark but still it was a good test to compare message passing in Erlang and in Java.
Apart from the usual – Java slow, Erlang fast – stuff, which we all know by now, I wanted to say I was pleasantly surprised that writing the solution in Erlang was a lot easier than writing it in Java. Not having to worry about threads sharing objects made it easier to think and write code. I really appreciated tail recursion in Erlang. As you can see, it takes fewer lines to solve the problem in Erlang. I’ve come to like the Actor model more.
I should also say that the Java solution would have been messier if not for java.util.concurrent classes. Without java.util.concurrent classes it would have been difficult to test and verify that the solution actually worked.
I could not run more than 5,000 threads in Java, so I used the -Xss1K option (which sets the per-thread stack size) and was then able to run 10,000 threads. In Erlang, however, I could create 30,000 processes. An Erlang process is tiny: a newly spawned one needs only a few hundred machine words of memory.
Lastly, I should mention that while running 10,000 threads in Java the process size of JVM was close to 200MB. The process size of Erlang VM while running 10,000 processes was just about 30MB.
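For readers who have not seen the ring problem, here is a minimal sketch of the thread-per-member approach built on java.util.concurrent. It is not the solution linked above. The class name `ThreadRing`, the `run` helper and the choice to send a single token around for N × M hops are all assumptions made for illustration, and the lambdas require Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration: one OS thread per ring member, one inbox per member.
class ThreadRing {
    /** Sends one token around a ring of `nodes` threads for `laps` laps; returns the hop count. */
    static long run(int nodes, int laps) {
        List<BlockingQueue<Long>> inboxes = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            inboxes.add(new LinkedBlockingQueue<>());
        }
        AtomicLong hops = new AtomicLong();
        CountDownLatch done = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < nodes; i++) {
            BlockingQueue<Long> inbox = inboxes.get(i);
            BlockingQueue<Long> next = inboxes.get((i + 1) % nodes);
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        long remaining = inbox.take();   // block until the token arrives
                        if (remaining == 0) {
                            done.countDown();            // all laps are finished
                        } else {
                            hops.incrementAndGet();
                            next.put(remaining - 1);     // forward to the neighbour
                        }
                    }
                } catch (InterruptedException e) {
                    // run() interrupts every member once the token has stopped: just exit
                }
            });
            threads.add(t);
            t.start();
        }

        try {
            inboxes.get(0).put((long) nodes * laps);     // inject the token at member 0
            done.await();
            for (Thread t : threads) t.interrupt();
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("ring run interrupted", e);
        }
        return hops.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long hops = run(100, 1000);
        System.out.printf("%d hops in %d ms%n", hops, (System.nanoTime() - start) / 1_000_000);
    }
}
```

Because every member is a real thread with its own stack, memory use grows with N, which is the effect the process-size numbers above describe.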
Would be nice to redo in Actorom for example.
Cheers
Stephan
http://twitter.com/codemonkeyism
I would especially like to know about “I wanted to say I was pleasantly surprised that writing the solution in Erlang was a lot easier than writing in Java.” in an Actorom/Erlang comparison.
Because lots of people have compared speed, but few have compared ease of use.
(I suspect Actorom to be worse in ease of use, but how much is the question, unusable?)
Cheers
Stephan
Why would you bother running so many Java threads? Do you also run 10,000 Windows processes on your machine? Did you attempt to run Erlang with +A 1024 and +S 1024? That would create a comparable level of OS scheduling and be equally irrelevant to obtaining meaningful data. A Java thread != an Erlang process; treating them the same doesn’t compare real usages of these systems.
What would be interesting would be to compare raw throughput with each VM running how it was designed to run, not staged in a manner where one is likely to fail and the other is not.
Implement this using Java 5 Executor tasks not threads, or better yet Kilim microthreads and mailboxes.
I do admit that the Erlang implementation will still be much more straightforward, but either of those alternative techniques will be closer to reality than these graphs.
Hello Erik,
Thanks for your comments. Regarding why I needed so many threads, I wanted to see the breaking point. I am working on a web application, which I am writing in Erlang. I have used Tomcat and JBoss clusters for a long time and I chose Erlang because I can get the benefit of JBoss clusters with a single Erlang VM. You can tweak server.xml in Tomcat and raise the max threads etc. The default in Tomcat is 150 threads but I find Erlang wins in this contest hands down.
No I did not use +A 1024. Why? I actually tried reducing the stack size of a thread in Java but I could not start the JVM with stack size less than 1K.
In running the tests, I tried to use the minimum amount of resources possible. I tried to use the best of what I could get in both VMs to accomplish my task – to pass messages around a ring of threads.
Yes, Erlang process != Java thread. Does it really matter to me as a programmer? See, I want X number of “programs that have their own life”. I want them to pass around a message. You use threads in Java and processes in Erlang, XYZ in another language. I don’t care what XYZ is as long as it can do the job quickly and perform well with less code.
It’s not fair to say that the tests were staged. I wrote those programs as I would normally write any program. I ran them just as I would run in production environment.
karlthepagan,
I did use Executor classes. You still need to create Runnables though. I’ve heard Jetlang was better than Kilim. I don’t know. I’d like to write these programs in Scala.
You don’t “need” that many threads, you just chose the wrong concurrency approach for the Java language.
“I tried to use the minimum amount of resources possible” – not really, you created 10,000 threads. As you’re aware, a Java thread corresponds to an OS thread. A sensible implementation would have created a pooled executor and limited that to a smaller number of threads. Scala does something similar for its actors and it’s been proven to scale quite well.
“Does it really matter to me as a programmer?” Well, yes. You should write to the strengths of the language you’re using if you care at all about how you spend your time. If you don’t like that language’s idioms, don’t use the language. To misuse the language only indicts your ability to adapt, not any particular runtime. If I rewrote your Erlang program to be serial, would you still consider it good Erlang? If I wrote an implementation in C++ without using references or pointers, would it be good C++? I don’t think you can argue that to any extent that will convince me.
“I ran them just as I would run in production environment.” I seriously doubt you run 10,000 threads in Java. Doing so is not how the VM is architected to run. If you do that in production on commodity hardware, you shouldn’t.
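The pooled-executor idea in this comment can be sketched roughly as follows. This is an illustration only, not code that either commenter posted: the class `PooledRing` and its methods are hypothetical names, ring members are reduced to slots in a counter array, and each hop is submitted as a short task to a fixed pool sized to the core count (Java 8 or later for the lambdas).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical illustration: N logical ring members share a small thread pool.
class PooledRing {
    /** One hop: `member` handles the token, then hands it to its neighbour as a new task. */
    static void hop(ExecutorService pool, int member, long remaining,
                    AtomicLongArray received, CountDownLatch done) {
        if (remaining == 0) {
            done.countDown();                       // token has completed all laps
            return;
        }
        received.incrementAndGet(member);           // per-member message count
        int next = (member + 1) % received.length();
        pool.execute(() -> hop(pool, next, remaining - 1, received, done));
    }

    /** Runs `laps` laps around `nodes` members; returns how many messages each member saw. */
    static AtomicLongArray run(int nodes, int laps) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        AtomicLongArray received = new AtomicLongArray(nodes);
        CountDownLatch done = new CountDownLatch(1);
        try {
            pool.execute(() -> hop(pool, 0, (long) nodes * laps, received, done));
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("ring run interrupted", e);
        } finally {
            pool.shutdown();                        // no hop is submitted after countDown()
        }
        return received;
    }

    public static void main(String[] args) {
        AtomicLongArray received = run(5000, 100);
        System.out.println("member 0 handled " + received.get(0) + " messages");
    }
}
```

Here the number of OS threads stays fixed no matter how large N gets, which is the trade-off being argued about in this thread: it scales better, but it no longer creates N threads.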
vijay: you’re using executors to launch threads, not schedule repeated tasks.
The SIGNIFICANT result of the scheduled executor task vs thread solution is that this approach only consumes 8MB of ram (in the N=5000 case) vs hundreds. If you are not aware, this is close to the minimum heap a Java application will allocate.
For another example I was able to construct 50,000 ring members/scheduled tasks in 1.16 seconds and that consumed 20.8MB of ram.
Use a ScheduledExecutorService with a number of threads equal to your core count, and then schedule repeated tasks like this: pool.scheduleWithFixedDelay(member, 0, 1, TimeUnit.MICROSECONDS); // dirty hack, it doesn’t allow zero delay
Then cancel their tasks at the same time you countDown().
I tried this, and on my Windows machine (which is unable to spawn 5000 threads) I can start 5000 scheduled tasks. My stats are relatively performant, but meaningless without a comparison to the threaded approach. It may be that, without a better scheduler implemented in ScheduledExecutorService (or hinting at which tasks have work to do), task performance will not be as great as the threaded approach. This is where MP / Actor frameworks written for the JVM show their merit.
When I get a chance to run all the numbers I’ll make my own post.
You should investigate multiple Java actor implementations like Kilim & Jetlang and not judge based on rumor.
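The scheduled-task recipe from this comment might look something like the sketch below. It is a guess at the shape of the idea, not the commenter's actual code: `ScheduledRing`, the mailbox queues and the single-token protocol are assumptions. Each member is a repeated task that polls its own mailbox, and all tasks are cancelled once the latch has been counted down (Java 8 or later for the lambdas).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration: ring members as repeated scheduled tasks on a core-sized pool.
class ScheduledRing {
    /** Sends one token for `laps` laps around `nodes` scheduled tasks; returns the hop count. */
    static long run(int nodes, int laps) {
        ScheduledExecutorService pool =
                Executors.newScheduledThreadPool(Runtime.getRuntime().availableProcessors());
        List<Queue<Long>> mailboxes = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            mailboxes.add(new ConcurrentLinkedQueue<>());
        }
        AtomicLong hops = new AtomicLong();
        CountDownLatch done = new CountDownLatch(1);
        List<ScheduledFuture<?>> tasks = new ArrayList<>();

        for (int i = 0; i < nodes; i++) {
            Queue<Long> inbox = mailboxes.get(i);
            Queue<Long> next = mailboxes.get((i + 1) % nodes);
            Runnable member = () -> {
                Long remaining = inbox.poll();      // non-blocking: null means no mail this round
                if (remaining == null) {
                    return;
                }
                if (remaining == 0) {
                    done.countDown();               // all laps are finished
                } else {
                    hops.incrementAndGet();
                    next.add(remaining - 1);        // forward to the neighbour's mailbox
                }
            };
            // Zero delay is rejected, so the smallest practical delay is used instead.
            tasks.add(pool.scheduleWithFixedDelay(member, 0, 1, TimeUnit.MICROSECONDS));
        }

        try {
            mailboxes.get(0).add((long) nodes * laps);   // inject the token at member 0
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("ring run interrupted", e);
        } finally {
            for (ScheduledFuture<?> task : tasks) {
                task.cancel(false);                 // stop the repeated polling
            }
            pool.shutdown();
        }
        return hops.get();
    }

    public static void main(String[] args) {
        System.out.println(run(50, 10) + " hops");
    }
}
```

Since every member polls whether or not it has mail, most task executions do no work, which matches the comment's caveat that a scheduler with no hint about which tasks are ready may lose to the threaded approach on raw speed.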
Ah, I see. Erik, you are right that I don’t need so many threads to accomplish this task. But remember that the problem statement is “Create N threads and pass M messages around. Do the same in another language.” That’s the reason I created N threads, where N is 10, 100, … 10K. Sure, you can accomplish the task using a thread pool, but that would be against the intent of the problem statement. I think the problem is meant to show that creating and managing threads in other languages is more expensive than in Erlang.
Just a note, the link to your Erlang solution actually points to the Java one and vice versa.
A few months after you posted this blog entry, a much improved Java thread-ring program was added to the benchmarks game – perhaps you could adapt that for your analysis –
http://shootout.alioth.debian.org/u32/benchmark.php?test=threadring&lang=java&id=4
Thanks for the link Isaac!