New version of python-crush with decent sanity checks during argument parsing. It makes for much better error messages.

#python #crush

Implementing better argument consistency warnings and errors for python-crush. Tedious but necessary.

#python #usability #crush

Trying to understand why ln(hash(x)) / weight leads to a better distribution than hash(x) * weight. It is more complicated, but why?

#math #crush
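
One way to look at the question above is a small simulation. If u is uniform in (0, 1], then -ln(u) / weight is exponentially distributed with rate weight, and the smallest of several independent exponentials belongs to item i with probability weight_i / sum(weights). Picking the largest ln(u) / weight is the same race, so the shares follow the weights exactly. With u * weight there is no such property. The sketch below uses floats and a hash from hashlib; the helper names (unit_hash, pick, shares) are made up for illustration and this is not the CRUSH code, which works in fixed point.

```python
import hashlib
import math


def unit_hash(x, item):
    """Deterministic pseudo-random value in (0, 1] for the pair (x, item)."""
    digest = hashlib.blake2b(f"{x}:{item}".encode(), digest_size=8).digest()
    return (int.from_bytes(digest, "big") + 1) / 2**64


def pick(x, weights, straw):
    """Return the item that draws the longest straw for input x."""
    return max(weights, key=lambda item: straw(unit_hash(x, item), weights[item]))


def shares(weights, straw, n=20000):
    """Fraction of the inputs 0..n-1 that lands on each item."""
    counts = dict.fromkeys(weights, 0)
    for x in range(n):
        counts[pick(x, weights, straw)] += 1
    return {item: counts[item] / n for item in weights}


weights = {"a": 1.0, "b": 3.0}

# ln(u) / weight: "a" should get close to 1 / (1 + 3) = 25% of the inputs.
log_shares = shares(weights, lambda u, w: math.log(u) / w)

# u * weight: "a" only wins when u_a > 3 * u_b, which happens about 1/6 of the time.
mul_shares = shares(weights, lambda u, w: u * w)

print("ln(u) / w:", log_shares)
print("u * w    :", mul_shares)
```

With weights 1 and 3, the logarithmic straw gives the lighter item about a quarter of the inputs, while the multiplicative straw gives it noticeably less than its weight would suggest.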

This morning's puzzle: avoid accumulating rounding errors when fixing crush weights. The goal is to do something like weight * correction instead of weight * correction1 * correction2 * correction3 * ...

But it's easier said than done, especially when mixing integers, Q16.16 and doubles.
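
A minimal sketch of the problem, assuming weights are stored as Q16.16 integers and each multiplication truncates back to Q16.16. The correction values and the helpers to_q and q_mul are invented for the example; they are not taken from python-crush.

```python
import math

Q = 1 << 16  # Q16.16: 16 fractional bits


def to_q(value):
    """Convert a float to Q16.16, rounding to the nearest representable value."""
    return int(round(value * Q))


def q_mul(a, b):
    """Multiply two Q16.16 numbers and truncate the product back to Q16.16."""
    return (a * b) >> 16


weight = to_q(1.0)
corrections = [1.01, 0.99, 1.02, 0.98, 1.003]  # hypothetical corrections

# Chained: the weight is rounded after every single correction.
chained = weight
for c in corrections:
    chained = q_mul(chained, to_q(c))

# Combined: multiply the corrections as doubles, convert and apply once.
combined_factor = math.prod(corrections)
single = q_mul(weight, to_q(combined_factor))

# Reference value in Q16.16 units, kept as a double.
exact = 1.0 * combined_factor * Q

print("chained error:", abs(chained - exact))
print("single error :", abs(single - exact))
```

The chained version pays one truncation (plus one conversion error) per correction, while the combined version pays it once, so its error stays within one unit of the last fractional bit.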



First draft of the crush algorithm implementation in C, using straws and moving exactly one item at a time. It should be at least an order of magnitude faster than the current python-crush implementation.


It feels good to have two new users of python-crush. Even better that they find corner cases that can be quickly fixed.


I fixed just one bug today. And managed to do it wrong five times.

#crush #looser

First successful rebalancing of a live Ceph cluster with python-crush optimize. First bug report as well. 😅

#crush #ceph

I got an idea about CRUSH optimization. When we know all the values to be distributed, rebalancing an uneven distribution should not be an optimization problem. We should be able to calculate exactly which weights lead to a perfect distribution.

Running a simulation with all known values at each step of the optimization is going in the right direction but is unnecessarily complicated.

I'll modify CRUSH to expose the values influenced by weights and see where it leads us.


Blog post explaining how python-crush optimize can be used to rebalance a Ceph cluster.

#ceph #crush

Proposed a strategy to handle OSD removal / addition when weight sets exist. Implementing it now. It could be really simple: all zero 😬

#ceph #crush

Rewriting crush optimization tests for better and faster coverage of the logic. I hope to go from a 30-minute runtime to a minute or two. The main optimization function is still too complex but I can't figure out how to break it down.

#crush #tests

crush optimization published, at last! One person agreed to test it on a live cluster. Fingers crossed.


Finished rewriting python-crush to use Q16.16 internally instead of floats. The optimization is stable and passes tests. Now to documentation !


When you spend hours implementing a fix, run the tests and discover that it does not fix anything after all.

#solitude #crush

Reworking python-crush to *not* use floats internally. Floats are good for people to read, but the rounding errors when converting to Q16.16 (which is what crush uses) are confusing and wrong. Even more so because they are rare.


Crafted a simple reproducer for the loss of precision in the crush compile/decompile loop.

#crush #floatingpingpong
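
A sketch of what such a reproducer could look like (not the actual one): take every Q16.16 fraction, print it as a decimal the way a decompile step might, read it back the way a compile step might, and count the values that do not survive. The number of printed digits and the choice between rounding and truncating are assumptions made for the example, not a description of what the crush tools do.

```python
Q = 1 << 16  # Q16.16: 16 fractional bits


def survives(raw, digits, convert):
    """True if raw is unchanged after a print / parse round trip."""
    text = f"{raw / Q:.{digits}f}"         # "decompile": fixed point to decimal text
    return convert(float(text) * Q) == raw  # "compile": decimal text to fixed point


def lost(digits, convert):
    """Number of fractional values in [0, 1) that change in the round trip."""
    return sum(not survives(raw, digits, convert) for raw in range(Q))


lost_round_5 = lost(5, round)  # 5 decimals, round to nearest when reading back
lost_trunc_5 = lost(5, int)    # 5 decimals, truncate when reading back
lost_round_3 = lost(3, round)  # 3 decimals, round to nearest when reading back

print("5 digits, round   :", lost_round_5)
print("5 digits, truncate:", lost_trunc_5)
print("3 digits, round   :", lost_round_3)
```

Five decimals are finer than 1/65536, so rounding on the way back recovers every value, while truncating does not. Three decimals cannot distinguish most Q16.16 values at all.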

I'm losing precision *somewhere* when converting from float to 16.16 fixed point. This is going to be fun to debug...

#python #crush

The python-crush optimize command can now be given a Ceph report. It's definitely the easiest way to optimize a crushmap, not only because it can read the PG num, pool size, etc. from the OSDMap, but also because it can verify that all mappings in the OSDMap match the mappings generated by python-crush.

I now need to complete the last (?) piece of the puzzle which is incremental optimization.


Which distributed systems would benefit from a better placement algorithm? Apart from storage and caching, which are already covered :-) I'm looking for contexts where CRUSH would be useful.

#crush #distributedsystems