Understanding performance loss due to high number of nanosleep calls on mOS
by Avinash Maurya (RIT Student)
Hi team,
We were evaluating the performance of Apache Spark on mOS and observed a
slowdown on the LWK compared to Linux. Since Spark uses Java for execution,
I wrote a small Java program to trace the reason behind this.
I used the strace utility to inspect the behavior of each thread of the
program and found a very large number of nanosleep calls on LWK. For
instance, in one particular execution, Linux made 3694 calls to
nanosleep({tv_sec=0, tv_nsec=1000000}, NULL), whereas the LWK execution
made 33124. The LWK execution completed in 1m43.721s against Linux's
1m5.871s. Adding up the relative timestamps of all nanosleep calls shows
that they contributed around 30 additional seconds to the LWK execution,
which makes them the major cause of the slowdown. Please find the logs and
the program attached.
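A quick consistency check on the numbers above: the LWK run made
33124 - 3694 = 29430 more nanosleep calls, and at the requested 1 ms each
that is at least about 29.4 s, in line with the roughly 30 s measured.
Below is a minimal sketch of how such a tally could be automated. It is not
the attached program. It assumes the trace was captured with strace -f -T,
so that every completed call ends with its duration in angle brackets, and
the class name NanosleepTally is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the program attached to this post.
// Assumption: the input was produced with `strace -f -T`, so each completed
// syscall line ends with the time spent in the call, e.g. "<0.001060>".
public class NanosleepTally {

    // Simple holder for the two numbers of interest.
    public static final class Tally {
        public final long calls;
        public final double seconds;

        Tally(long calls, double seconds) {
            this.calls = calls;
            this.seconds = seconds;
        }
    }

    // Matches any line that mentions nanosleep and ends in "<seconds>".
    // The leading \b excludes clock_nanosleep. "<unfinished ...>" lines have
    // no numeric duration and are skipped, so a call split across an
    // unfinished/resumed pair is counted once, on its "resumed" line.
    private static final Pattern NANOSLEEP =
        Pattern.compile("\\bnanosleep\\b.*<(\\d+\\.\\d+)>\\s*$");

    // Counts completed nanosleep calls and sums their reported durations.
    public static Tally tally(List<String> lines) {
        long calls = 0;
        double seconds = 0.0;
        for (String line : lines) {
            Matcher m = NANOSLEEP.matcher(line);
            if (m.find()) {
                calls++;
                seconds += Double.parseDouble(m.group(1));
            }
        }
        return new Tally(calls, seconds);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NanosleepTally <strace-output-file>");
            return;
        }
        Tally t = tally(Files.readAllLines(Paths.get(args[0])));
        System.out.printf("nanosleep calls: %d, total time: %.3f s%n",
                          t.calls, t.seconds);
    }
}
```

Running it once on the Linux trace and once on the LWK trace and comparing
the two totals would give the extra time spent in nanosleep directly.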
I am not sure what can be done to improve the performance of such
Java-based programs (I tried different versions and vendors of Java and
different programs). It would be very helpful if you could point me in a
direction for debugging this. I would also appreciate it if you could
share any Java-based multi-threaded experiments that have performed better
on LWK than on Linux.
Thanks,
Avinash