I'm trying to do some performance tweaking on a model. It does a lot of distance measurements, and I was trying to work out whether they were a bottleneck. Back in the very distant past, when I was learning RhinoScript, I was told that calling rs.Distance was the kiss of death for performance, so I thought I'd check whether that is still the case.
I started with a little cube of points. I have four methods of measuring distance: RhinoScript, maths, ghcomp, and ghcomp parallel. (I could put scipy and chums in too, but I haven't got around to wiring them into Grasshopper.)
All four run in the same Python component so that they share the same import headers.
The results are a bit counterintuitive, so I suspect I'm doing something slightly wrong!
Something else strange is that the two GH methods get slower as you re-run the component.
Here's a Gist, and here's the code:
import rhinoscriptsyntax as rs
import Rhino.Geometry as rg
#from clr import AddReference as addr
#addr("Grasshopper")
#from System import Object
#from Grasshopper import DataTree
#from Grasshopper.Kernel.Data import GH_Path
import random
import math
from datetime import datetime
import ghpythonlib.components as ghcomp
import ghpythonlib.parallel

# pointCloud (a list of points) and iterations (an int) are component inputs
cloudSize = len(pointCloud)

# RhinoScript
startTime = datetime.now()
for i in range(iterations):
    p1 = pointCloud[random.randint(0, cloudSize-1)]
    p2 = pointCloud[random.randint(0, cloudSize-1)]
    throwawayVariable = rs.Distance(p1, p2)
endTime = datetime.now()
Rhinoscript = endTime - startTime

# Grasshopper
startTime = datetime.now()
for i in range(iterations):
    p1 = pointCloud[random.randint(0, cloudSize-1)]
    p2 = pointCloud[random.randint(0, cloudSize-1)]
    throwawayVariable = ghcomp.Distance(p1, p2)
endTime = datetime.now()
Grasshopper = endTime - startTime

# GH parallel
startTime = datetime.now()
p1 = []
p2 = []
for i in range(iterations):
    p1.append(pointCloud[random.randint(0, cloudSize-1)])
    p2.append(pointCloud[random.randint(0, cloudSize-1)])
pairs = zip(p1, p2)
throwawayVariable = ghpythonlib.parallel.run(ghcomp.Distance, pairs, True)
endTime = datetime.now()
GHparallel = endTime - startTime
p1 = None
p2 = None

# Native maths
startTime = datetime.now()
for i in range(iterations):
    p1 = pointCloud[random.randint(0, cloudSize-1)]
    p2 = pointCloud[random.randint(0, cloudSize-1)]
    throwawayVariable = math.sqrt(math.pow(p1.X-p2.X, 2) + math.pow(p1.Y-p2.Y, 2) + math.pow(p1.Z-p2.Z, 2))
endTime = datetime.now()
native = endTime - startTime
What's the general feeling about this?
Computing the distance between two points is a very minimal operation, so anything which introduces overhead into this will have a significant effect on overall performance. The Distance method in Rhino core and RhinoCommon has a little overhead as it deals with some very rare special cases (basically when the points are very close together) that tend to give wrong results with the naive approach. If you know your points aren't like that, you can use your own distance function and it will be somewhat faster.
Using GH components to compute distances introduces a lot of overhead compared to the simplicity of computing distances.
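For comparison, the lowest-overhead managed call would be to go through RhinoCommon directly rather than through a component wrapper. A minimal sketch (my addition, assuming p1 and p2 are Rhino.Geometry.Point3d as in the code above):

# RhinoCommon call, no component wrapper around it
d = p1.DistanceTo(p2)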
Finally, if you only care about comparing distances, you can omit the square-root, as the squares of two positive numbers have the same smaller-than/larger-than relationship as the two numbers. Thus, the fastest you can make this is:
distance = (p1.X-p2.X)*(p1.X-p2.X)+(p1.Y-p2.Y)*(p1.Y-p2.Y)+(p1.Z-p2.Z)*(p1.Z-p2.Z)
However, it is quite often the case that you should optimize the code not by speeding up the individual operations, but by reducing the number of times you have to call these operations.
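To make that concrete, here is a minimal sketch (my addition, not from the thread) of a nearest-point search that compares squared distances inside the loop and takes a single square root at the end, only for the winning point:

import math

# Hedged sketch: nearest neighbour by squared distance.
# 'points' is assumed to be a list of Rhino.Geometry.Point3d,
# like the pointCloud input in the benchmark above.
def closest_point(query, points):
    best = None
    bestDistSq = float("inf")
    for p in points:
        dx = p.X - query.X
        dy = p.Y - query.Y
        dz = p.Z - query.Z
        distSq = dx*dx + dy*dy + dz*dz  # no sqrt inside the loop
        if distSq < bestDistSq:
            bestDistSq = distSq
            best = p
    return best, math.sqrt(bestDistSq)  # one sqrt, at the very end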
--
David Rutten
david@mcneel.com
Hi Ben,
(this is in addition to what David wrote)
If you are running the brute-force all-points-against-all-points approach then, for a sufficiently large number of points, methods that can reduce the complexity of the operation down from p² will outperform any kind of micro-optimization. RhinoCommon's RTree class helps reduce complexity in several of these proximity cases.
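As an illustration (my sketch, not Giulio's code): RhinoCommon's RTree lets you build the index once and then run radius searches against it, so you avoid testing every pair. Insert and Search are documented RTree members; pointCloud is assumed to be the same list of points as in the benchmark:

import Rhino.Geometry as rg

# Build the tree once from the cloud.
tree = rg.RTree()
for i, pt in enumerate(pointCloud):
    tree.Insert(pt, i)

# Collect the indices of all points within 'radius' of 'query'.
hits = []
def on_hit(sender, args):
    hits.append(args.Id)

query = pointCloud[0]
radius = 1.0
tree.Search(rg.Sphere(query, radius), on_hit)
# 'hits' now holds the indices of the nearby points.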
Giulio
--
Giulio Piacentino
for Robert McNeel & Associates
giulio@mcneel.com
Giulio, David,
I was thinking about this while I was going to sleep and came to the same conclusion. In my use case the distance calc isn't something I can really batch into a more efficient method; I was just trying to find the bottleneck, and when I came across this result it seemed counterintuitive.
I really ought to write each of them as a function and try them on the parallel call, but I'd imagine that the results would be the same.
The main thing that impressed me was how little the rs.Distance and maths methods of getting the distance differed. That seems to have changed a lot since the steam age.
Sorry for confusing things with the example, I just wanted to make sure that there wasn't any compiler optimisation going on if I measured the distance between the same two points a bunch of times.
Ben
Some more info related to python parallel performance that may be relevant to these results:
http://discourse.mcneel.com/t/ghpython-parallel-component-slower-th...
http://discourse.mcneel.com/t/new-ghpython-issues-crashing-multithr...