Hey HC and OCR folks,
I have run HPCToolkit on OCR before, for a workshop paper submission, but I did not look deeply into where our overheads might be (function pointers being the anticipated scapegoat).
I am planning to look into some schedulers, so I thought I should run some tests to see how far OCR is from HC and decide which one to start working on. I know this is not how one would properly compare the two, Cholesky may not be the right benchmark for it, and I do not want to jump to hasty conclusions, but I am pleasantly surprised by the results in the attachment and I will look further into why they turned out this way.
Zoran, I recall you were looking into a performance bug on a benchmark where Sanjay found some unexpected slowdown. I wonder whether this has anything to do with that.
Vincent, I believe you looked into caching the function pointers for some classes and making them static variables. I wonder if we should look into the impact of that change.
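To make the question concrete, here is a minimal sketch of the idea as I understand it: instead of each instance carrying its own copy of the function pointers, all instances of a class point at one static table. Every name below (task_fcts_t, task_t, task_init, task_run) is hypothetical and made up for illustration; none of them are actual OCR identifiers, and the real change may well look different.

```c
#include <stddef.h>

/* Hypothetical function-pointer table for one "class" of runtime object. */
typedef struct task_fcts {
    int  (*execute)(int arg);
    void (*destruct)(void *self);
} task_fcts_t;

/* Hypothetical implementations behind the pointers. */
static int  task_execute(int arg)     { return arg + 1; }
static void task_destruct(void *self) { (void)self; }

/* One static, read-only table shared by every instance of the class,
 * rather than a copy of the pointers stored in each object. */
static const task_fcts_t task_fcts_shared = { task_execute, task_destruct };

/* An instance only stores a pointer to the shared table. */
typedef struct task {
    const task_fcts_t *fcts;
    int id;
} task_t;

/* Initialize an instance: no per-instance table is built or copied. */
void task_init(task_t *t, int id) {
    t->fcts = &task_fcts_shared;
    t->id = id;
}

/* The call still goes through a function pointer, but the table is
 * shared, so instances are smaller and the table stays in one place. */
int task_run(const task_t *t, int arg) {
    return t->fcts->execute(arg);
}
```

With this layout, task_init(&t, 7) followed by task_run(&t, 41) dispatches through the shared table. Whether this helps in practice is exactly what would need measuring; the sketch only shows the shape of the change.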
Romain/Vincent, how much of this do you think we may attribute to the allocators? I am not sure which allocator is used where anymore; I suspect it may even be malloc for OCR and a thread-safe malloc for HC.
TL;DR: OCR seems to outperform HC on one benchmark in one environment, even though we did not tune it and it has function pointers everywhere.
P.S. Results were collected on a 2-socket Xeon E5-2690 machine (8 cores per socket) with gcc47 at -O3, boost 1.49, the OCR master branch, and the HC trunk branch.
Just to see if it could be done, I tried to build OCR on my Mac today. Turns out it's pretty easy. You need to install the following packages from one of the distributions that port Linux stuff to OSX:
I used the "brew" port manager. It seems to be more robust than some of the other ones.
If you use brew, it installs libxml2 in an unusual place so as not to conflict with the different version of libxml2-2 that OSX installs, so you have to set some shell variables to get OCR to build:
With those things installed/configured, the install.sh script ran just fine, and I was able to run the treesum examples. Hopefully, that means it just works :)