Tuesday, June 2, 2009
Line of code of the day #2
Who would have guessed, it's another LOCOTD. This time we have some C code to show...
solution[batch] = solution[batch] | ~((polar_0[batch] ^ polar_1[batch])&polar_0[batch]) | ((polar_0[batch] ^ polar_1[batch])&polar_1[batch]);
In case you're wondering... yes, I could first load the values of polar_0 and polar_1 at index batch into local variables, but since I don't know what kind of effect that will have on performance, I'll try both and see. Compilers and assemblers do magic sometimes... :p
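The two variants being compared could look roughly like the sketch below. The function names, the uint32_t element type and the size_t index are assumptions made for illustration, since the post only shows the single line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Variant A: index the arrays directly, exactly as in the line above.
 * The uint32_t element type is an assumption; the post does not state it. */
void update_indexed(uint32_t *solution, const uint32_t *polar_0,
                    const uint32_t *polar_1, size_t batch)
{
    assert(solution != NULL && polar_0 != NULL && polar_1 != NULL);
    solution[batch] = solution[batch]
        | ~((polar_0[batch] ^ polar_1[batch]) & polar_0[batch])
        | ((polar_0[batch] ^ polar_1[batch]) & polar_1[batch]);
}

/* Variant B: load both values once and reuse the XOR.
 * It computes the same value as variant A. */
void update_locals(uint32_t *solution, const uint32_t *polar_0,
                   const uint32_t *polar_1, size_t batch)
{
    assert(solution != NULL && polar_0 != NULL && polar_1 != NULL);
    const uint32_t p0 = polar_0[batch];
    const uint32_t p1 = polar_1[batch];
    const uint32_t diff = p0 ^ p1;   /* bits where the two words differ */

    solution[batch] |= ~(diff & p0) | (diff & p1);
}
```

An optimizing compiler will often emit the same machine code for both, but measuring is the only way to be sure. As a side note, by plain Boolean algebra the expression reduces to solution | ~p0 | p1, which may be worth adding to the comparison.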
I'll take this moment to show you some magic... long live SWAR (SIMD Within A Register) algorithms!
Jokes aside, using general-purpose CPU instructions to pull off some crazy, highly efficient operations is always good. It is wise to look into such approaches, but keep moderation in mind, or you can end up with some brain damage...
Now... a little puzzle for anyone who is willing to take it:
I have an array of 32-bit masks that together represent one very large mask for a very large matrix. I want to grab the masked bits (the number of masked bits is known) and pack them all into another array, and all of that must be done fast, since speed is what we want. Here is a little example with 8-bit masks and an array of length 4, where "mask" selects the bits, "masked" is the data they are taken from, and "result" is the packed output:
mask   01111000 01001001 10101110 10111100
masked 01101011 10100101 11101110 11010010
result 00000000 00000001 10100111 11110100
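To pin down what the example means, here is a minimal sketch of the naive, bit-by-bit version in C. It is only a slow reference to check faster attempts against, not the efficient solution the puzzle asks for. The function name pack_masked_bits, the uint32_t word type, and the choice to treat the arrays as one long bit string with the packed bits right-aligned (as in the result row above) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Naive reference: copy every bit of data[] whose mask[] bit is set into
 * out[], keeping the bit order and right-aligning the packed bits.
 * The arrays are read as one long bit string, word 0 most significant.
 * out must have room for n words. Returns the number of bits packed. */
size_t pack_masked_bits(const uint32_t *mask, const uint32_t *data,
                        uint32_t *out, size_t n)
{
    size_t packed = 0;   /* bits written so far, counted from the right */

    if (n == 0)
        return 0;
    assert(mask != NULL && data != NULL && out != NULL);
    memset(out, 0, n * sizeof *out);

    for (size_t w = n; w-- > 0; ) {                 /* last word first */
        for (unsigned bit = 0; bit < 32; ++bit) {   /* lowest bit first */
            if ((mask[w] >> bit) & 1u) {
                const uint32_t value = (data[w] >> bit) & 1u;
                const size_t out_word = n - 1 - packed / 32;

                out[out_word] |= value << (packed % 32);
                ++packed;
            }
        }
    }
    return packed;
}
```

Reading the four 8-bit values of the example as a single 32-bit word (mask 0x7849AEBC, masked 0x6BA5EED2), this returns 17 and stores 0x0001A7F4, which matches the result row.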
I do not have an efficient solution for this problem yet (I kind of have an idea). I hope to find one, since this is crucial for optimizing the speed of the rest of the algorithm.
With that said, feel free to comment and give suggestions about it...
Good coding...