Weee... another LOCOTD... (no... that doesn't mean Crazy Tower Defence)... today I present the following line of code:
solution[i] = solution[j] | (~0)&(1<<z)
Those most familiar with C and pointers will notice that if i is not guaranteed to stay within the same bounds as j (or vice versa), something really bad will happen with the memory access/write... the lovely, usual segmentation fault, or worse... the line should have looked like this:
solution[j] = solution[j] | (~0)&(1<<z)
Or in a more compact form:
solution[j] |= (~0)&(1<<z)
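As a side note, (~0) is all ones, so (~0)&(1<<z) is just (1<<z). Below is a minimal sketch of a defensive way to write the same update; the helper name set_solution_bit, the uint32_t element type and the len parameter are assumptions made for illustration, not something taken from the original code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: set bit z of solution[j].
 * Returns 0 on success, -1 if j or z is out of range,
 * so a wrong index cannot silently scribble over other memory. */
static int set_solution_bit(uint32_t *solution, size_t len, size_t j, unsigned z)
{
    assert(solution != NULL);          /* a NULL array is a programming error */

    if (j >= len || z >= 32u) {
        return -1;                     /* refuse instead of writing out of bounds */
    }
    solution[j] |= (uint32_t)1 << z;   /* same effect as | (~0)&(1<<z) */
    return 0;
}
```

With a check like this, a mixed-up index shows up as a -1 return at the call site instead of as a corrupted pointer somewhere else.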
I lost 3 hours finding where the error in my code was... you see, with i there instead of j, I was writing to some other area of memory, and that area happened to hold a pointer inside a struct. At first I thought the error was in the functions that manipulate that struct's variables... then I noticed something very strange when I swapped some entries in the struct... it looked like something impossible, but after the swap some stuff worked, so the struct handling looked fine... the only answer would be that I was modifying the struct's data without knowing it... after a bunch of printf's (like... everywhere) I tracked the error down... it cost me a lot of time... and now I 'only' have to finish some algorithmic stuff that is implemented, polish some of the code and optimize some memory accesses...
Well... good coding... I hope you have better luck than me...
Thursday, June 18, 2009
Tuesday, June 9, 2009
Line of code of the day #3
So... remember that little problem I had with 'compacting' selected bits? I think I solved it...
By using a very special look-up table I managed to do the trick with this 'easy' to read line:
(( f[(m & 0xF0) | (val & 0xF0) >> 4] >> 3) >> (f[(m & 0x0F)<< 4 | (val & 0x0f )] >> 3 ) & 0xF)
Oh, this code is just for 8 bits; a simple expansion allows me to use it for 32 bits (with the look-up table taking up 2Mb)... and no... I won't explain what the f table holds... and yes... I might have screwed up the parentheses... I didn't compile that piece of code, but tested it by hand instead... it should work... efficiency-wise it's better than scanning all the bits one by one...
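Since the contents of f stay secret, here is a rough sketch of how a nibble-indexed look-up table can compact bits in general. This is not the f table from the line above: the names compact4, count4, init_tables and compact8 are made up for illustration, and the layout (one table with the packed bits, one with the bit count of each mask nibble) is simply one possible choice:

```c
#include <assert.h>
#include <stdint.h>

static uint8_t compact4[256]; /* index: (mask nibble << 4) | value nibble */
static uint8_t count4[16];    /* number of set bits in each mask nibble   */
static int tables_ready = 0;

/* Fill both tables once, scanning each nibble from its top bit down. */
static void init_tables(void)
{
    for (unsigned m = 0; m < 16; m++) {
        for (unsigned v = 0; v < 16; v++) {
            unsigned packed = 0, bits = 0;
            for (int b = 3; b >= 0; b--) {
                if ((m >> b) & 1u) {
                    packed = (packed << 1) | ((v >> b) & 1u);
                    bits++;
                }
            }
            compact4[(m << 4) | v] = (uint8_t)packed;
            count4[m] = (uint8_t)bits;  /* depends only on m */
        }
    }
    tables_ready = 1;
}

/* Compact the bits of val selected by m (8 bits, most significant first). */
static unsigned compact8(uint8_t m, uint8_t val)
{
    assert(tables_ready);  /* init_tables() must have been called first */

    unsigned hi = compact4[(m & 0xF0u) | (unsigned)(val >> 4)];
    unsigned lo = compact4[((m & 0x0Fu) << 4) | (val & 0x0Fu)];

    /* Make room for the low nibble's selected bits, then append them. */
    return (hi << count4[m & 0x0Fu]) | lo;
}
```

After init_tables(), compact8(0x78, 0x6B) gives 0x0D (binary 1101): two table look-ups, one shift and one OR per byte instead of a loop over eight bits.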
You can now commit suicide.
PS: By reading this you agree that I'm not responsible for any harm, physical or mental, that occurred while reading the text above... really...
Tuesday, June 2, 2009
Line of code of the day #2
Who would have guessed, it's another LOCOTD; this time we have some C code to show...
solution[batch] = solution[batch] | ~((polar_0[batch] ^ polar_1[batch])&polar_0[batch]) | ((polar_0[batch] ^ polar_1[batch])&polar_1[batch]);
In case you're wondering... yes, I could load the values of polar_0 and polar_1 at index batch into local variables first, but since I don't know what kind of effect that will have on performance, I'll try both and see... compilers and assemblers do magic sometimes... :p
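For anyone who wants to play with it, here is a small sketch of that line in three equivalent shapes: as written, with the XOR computed once in a local, and boiled down with plain Boolean algebra (per bit, the two terms reduce to ~polar_0 | polar_1). The function names and the uint32_t word type are assumptions for illustration; which one runs faster is exactly the kind of thing that needs measuring:

```c
#include <assert.h>
#include <stdint.h>

/* The expression as posted, with the array reads passed in as values. */
static uint32_t update_original(uint32_t sol, uint32_t p0, uint32_t p1)
{
    return sol | ~((p0 ^ p1) & p0) | ((p0 ^ p1) & p1);
}

/* Same expression, with the shared XOR computed once. */
static uint32_t update_locals(uint32_t sol, uint32_t p0, uint32_t p1)
{
    const uint32_t diff = p0 ^ p1;
    return sol | ~(diff & p0) | (diff & p1);
}

/* Simplified form: a result bit is set unless p0 = 1 and p1 = 0. */
static uint32_t update_simplified(uint32_t sol, uint32_t p0, uint32_t p1)
{
    return sol | ~p0 | p1;
}

/* The operators are purely bitwise, so checking all-zero and all-one
 * words for each input covers every per-bit combination. */
static void check_forms_agree(void)
{
    const uint32_t words[2] = {0u, UINT32_MAX};
    for (int s = 0; s < 2; s++) {
        for (int a = 0; a < 2; a++) {
            for (int b = 0; b < 2; b++) {
                const uint32_t ref = update_original(words[s], words[a], words[b]);
                assert(update_locals(words[s], words[a], words[b]) == ref);
                assert(update_simplified(words[s], words[a], words[b]) == ref);
            }
        }
    }
}
```

A decent optimizing compiler may well produce the same machine code for all three, so the source-level choice can come down to readability.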
I'll take the moment to show you some magic... long live SWAR algorithms!
Jokes aside, using general-purpose CPU instructions to pull off some crazy, highly efficient operations is always good. While it is wise to look into such approaches, keep moderation in mind too, or you can end up with some brain damage...
Now... a little puzzle for anyone who is willing to take it:
I have an array of 32-bit masks that in reality represents one very huge mask for a very huge matrix. Now... I want to grab the masked bits (the number of masked bits is known) and pack them all into another array, and all of that must be done fast, since speed is what we want. Here is a little example for 8-bit masks with an array of length 4:
mask 01111000 01001001 10101110 10111100
masked 01101011 10100101 11101110 11010010
result 00000000 00000001 10100111 11110100
I do not have an (efficient) solution for this problem (yet - I kinda have an idea); I hope to find one, since this is kinda crucial to optimizing the speed of the rest of the algorithm.
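To pin the puzzle down, here is the slow, obvious baseline that any clever solution has to match: scan every mask bit from the most significant end and append the selected bits to the result. It is only a sketch; the function name pack_masked_bits is made up, and it assumes 8-bit words as in the example and at most 32 selected bits in total, so everything fits in a single uint32_t:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Naive reference: collect the bits of val[] selected by mask[],
 * most significant first, right-aligned in the returned word. */
static uint32_t pack_masked_bits(const uint8_t *mask, const uint8_t *val, size_t n)
{
    uint32_t out = 0;
    unsigned used = 0;  /* number of bits packed so far */

    for (size_t i = 0; i < n; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if ((mask[i] >> bit) & 1u) {
                assert(used < 32u);  /* assumption: result fits in 32 bits */
                out = (out << 1) | ((uint32_t)(val[i] >> bit) & 1u);
                used++;
            }
        }
    }
    return out;
}
```

Fed the example above (masks 0x78, 0x49, 0xAE, 0xBC and values 0x6B, 0xA5, 0xEE, 0xD2), it returns 0x0001A7F4, which is the result row written in hex.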
With that said, feel free to comment and give suggestions about it...
Good coding...
Labels: C, english, informative, links, LOCOTD, programming, SWAR