Is there a plan to support multicore in the future?

The following extract is from this discussion: http://www.grasshopper3d.com/forum/topics/performance-of-grasshopper?

When it comes to performance in Grasshopper at present, getting more cores won't help you much; a single fast core is a much better choice. Plenty of fast memory also helps a great deal.

Looking to the near future, the processes in Rhino and Grasshopper that would benefit most from extra cores actually lend themselves remarkably well to multi-threading. Things like intersections, meshing, and operations on individual items in arrays would all benefit, since they involve a lot of repetition where one iteration does not depend on the previous one.
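As a rough illustration of "one iteration does not depend on the previous one", here is a minimal Python sketch. It is not Rhino or Grasshopper code: `segment_plane_hit` and `intersect_all` are hypothetical names, and a line segment against a horizontal plane stands in for a real intersection. Python threads only show the structure of the split here; they are not meant as a performance claim.

```python
from concurrent.futures import ThreadPoolExecutor

def segment_plane_hit(segment, plane_z):
    """Return the point where a segment crosses the plane z = plane_z, or None."""
    (x0, y0, z0), (x1, y1, z1) = segment
    if (z0 - plane_z) * (z1 - plane_z) > 0 or z0 == z1:
        return None  # both ends on the same side, or segment parallel to the plane
    t = (plane_z - z0) / (z1 - z0)
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0), plane_z)

def intersect_all(segments, plane_z, workers=4):
    """Each item is handled independently, so the work can be split across workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even though items run concurrently
        return list(pool.map(lambda s: segment_plane_hit(s, plane_z), segments))

segments = [((0, 0, 0), (0, 0, 2)), ((1, 0, 3), (1, 0, 5)), ((2, 0, -1), (2, 0, 1))]
hits = intersect_all(segments, plane_z=1.0)
```

No item reads or writes anything another item touches, which is what makes this kind of loop easy to distribute.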

Rhino4 was not designed to be thread-safe, and there were places where it was not possible to thread certain tasks. For example, imagine the Contour command. You'd think that it would be a piece of cake to thread that: you assign the first 25 contour intersections to core 1, the next 25 to core 2, the next 25 to core 3, and so on. But as it turns out, intersecting a Brep and a Plane requires Rhino to build a spatial tree of the Brep first (assuming it doesn't exist yet). These trees vastly speed up a lot of operations and they are created lazily, meaning they get created the first time they are needed. Now we suddenly have four threads all trying to run a Brep-Plane intersection and all trying to build the same spatial tree at the same time. This cannot end well. So in Rhino5 we made sure that while the spatial tree is being built, every other thread that tries to access the Brep gets put on hold until the tree is done.
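The "put every other thread on hold until the tree is done" idea can be sketched with a lock around the lazy build. This is only an assumption about the general pattern, not how Rhino5 implements it: `Shape`, `index` and `count_below` are hypothetical names, and a sorted list stands in for the spatial tree.

```python
import bisect
import threading

class Shape:
    """Hypothetical stand-in for a Brep that owns a lazily built spatial index."""

    def __init__(self, points):
        self.points = list(points)
        self._index = None                  # built the first time it is needed
        self._index_lock = threading.Lock()
        self.build_count = 0                # for demonstration only

    def _build_index(self):
        self.build_count += 1
        return sorted(self.points)          # placeholder for an expensive tree build

    def index(self):
        if self._index is None:
            with self._index_lock:
                # Re-check inside the lock: another thread may have finished
                # the build while this one was waiting.
                if self._index is None:
                    self._index = self._build_index()
        return self._index

    def count_below(self, limit):
        """Read-only query that relies on the lazily built index."""
        return bisect.bisect_left(self.index(), limit)

shape = Shape([5, 1, 4, 2, 3])
results = []
threads = [threading.Thread(target=lambda: results.append(shape.count_below(4)))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Four threads ask for the index at roughly the same time, but only one of them builds it; the others wait on the lock and then reuse the result.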

Then there's the problem that the Intersection function might store temporary data on the Brep during the intersection, which makes threading intersections on the same Brep an absolute impossibility.

Then there's the even worse problem that the Intersection function might store temporary data in a static cache, which means you cannot run the function more than once at a time, even if it's on different Breps.

In Rhino5 we tried to rectify all of these problems. I think we got most of them by now.

When Grasshopper switches to Rhino5 for good, we'll start looking into threading a lot more seriously, not least because we'll also switch to .NET 4, which has some pretty cool mechanisms for writing decent multi-threaded (MT) code.

David Rutten

david@mcneel.com

Poprad, Slovakia

Replies to This Discussion

Permalink Reply by Robert Vier on October 21, 2012 at 7:50am

.. may I bring this up another time? I read you are quite busy these times ..

Simple question (I think):

Is parallel computation of an entire graph, or parts of it, constrained by the same issues as single components, as you described above? I think it is not.

Do you see serious technical problems of taking GH's solver and putting it onto another core/machine with different parameters? Thinking about distributed evolutionary computation.

We did something similar, namely evaluating the network on our own on multiple cores within a component, but it is quite cumbersome to maintain the solver and cover all possibilities. How difficult would it be to address GH's native solver either from a component (preferred) or from 'outside'?

Best, Robert

Permalink Reply by David Rutten on October 21, 2012 at 2:14pm

In the case of global variables, it actually becomes impossible to use a certain function more than once at a time, no matter whether that function operates on the same data. So let's assume that the intersector for Breps and Planes uses global variables to cache some data used during the process (it doesn't do that, I'm just using it as an example). Then if you try to compute the intersection between BrepA and PlaneA while in thread #2 an intersection is computed between BrepB and PlaneB, you're totally hosed. Now both functions are writing data to the global variables and they both expect that data to remain unmolested, which isn't happening. Best case scenario, wrong result. Worst case, ultra-mega-crash.
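A minimal Python sketch of that failure mode follows, using the same hypothetical premise. None of these names come from Rhino: `scale_unsafe` keeps its scratch data in a module-level variable, `scale_isolated` keeps it in per-thread storage, and a `threading.Barrier` is used only to force the two calls to overlap so the outcome is repeatable.

```python
import threading

_scratch = {}                   # module-level cache shared by every caller
_local = threading.local()      # per-thread storage

def scale_unsafe(values, factor, pause):
    _scratch["factor"] = factor             # write shared state
    pause()                                 # stands in for work while another thread runs
    return [v * _scratch["factor"] for v in values]   # may read the other call's data

def scale_isolated(values, factor, pause):
    _local.factor = factor                  # each thread sees its own attribute
    pause()
    return [v * _local.factor for v in values]

def run_pair(func):
    barrier = threading.Barrier(2)          # both threads meet inside pause()
    out = {}

    def worker(name, factor):
        out[name] = func([1, 2, 3], factor, barrier.wait)

    threads = [threading.Thread(target=worker, args=("a", 10)),
               threading.Thread(target=worker, args=("b", 100))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

unsafe = run_pair(scale_unsafe)      # both calls end up using the same factor
isolated = run_pair(scale_isolated)  # each call keeps its own factor
```

With the shared variable, one of the two calls silently returns a result computed with the other call's factor, even though the two calls work on unrelated inputs. That is the "wrong result" case; real code with pointers and buffers can do far worse.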

Permalink Reply by Robert Vier on October 21, 2012 at 2:38pm

ok thanks!

I think we've been doing it with singletons (?) .. I didn't do that myself.

so, waiting for a MT implementation on rhino5 .. :)

Permalink Reply by dominic on October 21, 2012 at 5:38pm

Maybe things will change with Haswell... Microsoft apparently gave up on software transactional memory (TM), but maybe hardware TM will be viable?

Interesting blog on STM. Maybe his phrase "Isolation first; immutability second; synchronization last" is a good reminder as to how concurrency needs to be approached.

Permalink Reply by Sander Mulders on October 23, 2012 at 4:31am

Hmm, this makes me wonder why this example does seem to work. It does a point inclusion test, and to me it seems to work fine and a lot faster than single-threaded. I have used it in a number of definitions and it has not resulted in the massive-ultra-mega-crash (yet :) )

Attachments:

multiThreading.gh, 8 KB

Permalink Reply by David Rutten on October 23, 2012 at 6:17am

Massive ultra mega crashes are not guaranteed of course; stuff might just work. Then again, it might not. In Rhino4 we do not guarantee that any method is thread-safe, and I do not know our official position for Rhino5 SDK methods. I'll ask at the meeting tonight.

Permalink Reply by David Rutten on October 24, 2012 at 3:41am

So basically bad news: if you want to use a specific function, then maybe we'll look into it to see if it's thread-safe. There has been some work done on locking threads and removing globals, but Rhino5 is only thread-safe in certain parts (meshing and intersections, for example).

Permalink Reply by Sander Mulders on October 24, 2012 at 2:20pm

hmm owkey, for now the adventurous way then :) Just try and see if it works. It is still too interesting to ignore ;)

Permalink Reply by Jonathan Sheridan on October 23, 2012 at 6:29am

I can imagine that addressing multi-threaded processing needs a clean sheet in any system? It seems that Rhino 5 is supplying this for GH, and I wonder if there is a plan to use GH to help design itself for this increased processing power? For example, mapping and illuminating where, how and when multi-threading is an option or a possibility?

Permalink Reply by David Rutten on October 23, 2012 at 9:33am

To make a fully thread-safe application you will probably have to start from scratch. But that is not our aim. All we want to achieve is that operations that don't modify the data can be called from multiple threads. These are what we call 'const methods'. Basically, we want you to be able to compute the volume of a Brep in two disjoint threads at the same time and get the correct answer in both cases. We don't really want to make it possible to fillet two different edges of the same Brep from two different threads at the same time, that is definitely 'a bridge too far'.

This typically isn't a problem, but Rhino is not a very typical application. There's an incredible amount of hard-core optimization going on and a big part of optimizing code is often building caches which can then be re-used to speed up subsequent operations. Whenever these caches are built, we need to put all competing threads on hold, because trying to build the same cache more than once at the same time will result in errors or crashes.

The reason for making Rhino more thread-safe than before is not just Grasshopper. In fact, since Grasshopper doesn't use threading, for Grasshopper this is only a hypothetical need at present. However, there are plenty of plug-ins out there (some made by RMA, some not) that would like to speed things up by multi-threading operations, so there already is a definite need for this, with or without Grasshopper.
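The 'const methods' goal described in this reply (computing a volume from two threads at once and getting the right answer in both) can be sketched in Python under some loud assumptions: `Box` is a hypothetical immutable stand-in for a Brep, and its `volume` method reads fields only and keeps no cache at all, which sidesteps the cache-building problem rather than solving it.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Hypothetical immutable solid; frozen=True rejects attribute assignment."""
    width: float
    depth: float
    height: float

    def volume(self):
        # Reads fields only and keeps all working data in local variables,
        # so concurrent calls cannot interfere with each other.
        return self.width * self.depth * self.height

box = Box(2.0, 3.0, 4.0)
with ThreadPoolExecutor(max_workers=2) as pool:
    # Two threads query the same object at the same time.
    futures = [pool.submit(box.volume) for _ in range(2)]
    volumes = [f.result() for f in futures]
```

Anything that mutates the object, like the fillet example, falls outside this pattern and would need real coordination between threads.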

Permalink Reply by dominic on October 24, 2012 at 8:38am

Maybe for Rhino V6? ETA 2018?

Looking at what Mr. Intel suggests coders have to look at, it's no wonder they are looking at making locks faster. Shared memory/caches look like a big pain that will never really have a good solution.

It looks like some geometry libraries like ACIS are already thread-safe. Interesting to read this blog about how ACIS is based on shared memory, but CGM is not, being not thread-safe / reentrant. CGM provides multiple processing via message passing. Maybe Rhino could do something similar...?

Meshing? Sounds similar to Opencascade's experience. Apparently, STM is a comparatively less error prone route, especially when fine grain locking is involved.

In the meantime, I suppose it should be possible to 'multiplex' instructions SIMD-style to 'isolated' or 'embarrassingly parallel' tasks? A lot of this is being driven by GPU computing. I guess if you need to calculate intersections or have massive amount of transforms, GH should be able to harness the GPU.... directly from within the custom component?

What about assembly modeling? What if I load up a 'campus' of buildings, each defined by a separate GH script, slaved to a 'master script' that defines the buildings' lots? GH, being a single-threaded in-memory modeler, would have to regenerate all of the buildings sequentially?
