entagma


Advanced Setups 21 - Jeroen Claus: HDBSCAN

Jeroen Claus covers another important clustering algorithm in the third part of his series on popular algorithms used in data analysis and visualisation. In this episode he’ll cover HDBSCAN, an improvement over DBSCAN. Of course, Jeroen goes over implementing it in Houdini using VEX.

Comments

Great tutorial and a very interesting technique. Thank you for this! I hope Advanced Setups will continue with more material on data analysis. I would like to share some findings I stumbled upon when trying to apply this technique to larger datasets (several hundred thousand to millions of points). Using "int npts[] = nearpoints( 0, "@tree!=" + itoa( i@tree ), v@P, 10, 5 );" seems to get dramatically slower, seemingly exponentially, as the number of points to process grows. I guess this is due to the group expression used to exclude points with the same tree value. I got the (visually) same result by looking up a couple of hundred points within a large radius and comparing their attributes in a foreach loop, which seemed to improve performance by orders of magnitude. This is what my wrangle looks like: https://github.com/MattiBRNDMR/houdini_snippets/blob/main/pcfind_exclude_by_same_attribute Another bottleneck I found when working with large datasets was the foreach point loop used to promote the min distance etc. Rebuilding the foreach point loop in VEX further accelerated the process: https://github.com/MattiBRNDMR/houdini_snippets/blob/main/get_min_on_a_dataset
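
For readers who want a starting point, the first idea in the comment above (an ungrouped lookup followed by a manual attribute comparison) could look roughly like the point wrangle below. This is only a sketch and not the wrangle from the linked repository. It assumes an integer point attribute named tree, as in the quoted snippet. The channel names search_radius, candidate_count and max_found, and the output attribute npts, are hypothetical and chosen for illustration.

```vex
// Point wrangle (Run Over: Points). Sketch only, not the linked snippet.
// Assumes an integer point attribute "tree" exists on input 0.
// Hypothetical channels: search_radius, candidate_count, max_found.
float search_radius   = chf("search_radius");    // e.g. 10
int   candidate_count = chi("candidate_count");  // e.g. a few hundred
int   max_found       = chi("max_found");        // e.g. 5

// Ungrouped lookup: no group expression has to be evaluated per point.
int candidates[] = nearpoints(0, v@P, search_radius, candidate_count);

int found[];
foreach (int pt; candidates)
{
    // Skip candidates in the same tree (this also skips the current point).
    int other_tree = point(0, "tree", pt);
    if (other_tree == i@tree)
        continue;

    append(found, pt);

    // Stop once enough neighbours from other trees are collected.
    if (len(found) >= max_found)
        break;
}

// Store the filtered neighbours (hypothetical output attribute).
i[]@npts = found;
```

The early exit relies on nearpoints returning candidates ordered by distance, which is my understanding of its behaviour. Note that the result only matches the grouped query when candidate_count is large enough to reach past the points of the current tree, which is presumably why the comment describes the results as visually the same rather than identical.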

Matti Brandmaier

Fantastic buildup towards advanced understanding.

Van

