søndag den 12. december 2010

Smaller arrays

With arrays sizes 1 to 10000:



and with the ratios truncated

How to do housecleaning

[i'll upload the code later]

Housecleaning I say! Time to get rid of those array elements that have been laying around since last winter (1 ms ago).

Deleting elements from an array requires a bit of delicacy if we want to do it in reasonable speed. I found two (new to me) methods of cleaning the arrays. The PopBack and the Defrag methods.

PopBack overwrites the element-to-be-deleted with the last element and then decrement the number of elements (or move 'end' pointer). PopBack destroys the order in the array, but can otherwise be used at any point in time. PopBack here accesses elements in random order.

With Defrag the goal is to partition the array into 'live' and 'die' elements. The 'live' elements are put in the beginning of the array. The element count is then decremented (or 'end' moved). This is done by accumulating a list of elements to die (aka death-row). We then re-assign the original array to itself while skipping elements in the death-row. The elements in death-row needs to be in linear order, so the method is only practically valuable if death-row does not need sorting.

I've made an experiment with the two methods, with Defrag working with a sorted death-row. Test parameters are array size and percentage of the array to be deleted. The deleted elements are picked at random. Results for my 3.73 GHz Xeon cpu:



In the bottom left dark blue region PopBack is faster than Defrag. For smallest arrays (~1000 elements) it's only after deleting more than 40% of the elements that Defrag becomes faster than PopBack.

Whenever we want to delete a high percentage, or have a large array, then Defrag becomes faster. This is very likely due to Defrag accesses two blocks of memory in linear order, while PopBack needs random access.

I'll redo the experiment of smaller array, in higher detail, soon :)

tirsdag den 7. december 2010

refactoring refactoring refactoring

and ending up with your descriptor class doesn't need to include anything. And everything* is sane and readable. Oh joy!

*) i wrote.

lørdag den 27. november 2010

touchy feely

When plowing through other peoples source i often find myself yelling and screaming that things are a complete chaos and mostly meaningless. While this is obviously partially true, it's definitely not true for the person that wrote the code originally as he/she have it all mapped out in their heads.

[goes to refill his drink]

One particular case where my minds says "change this" is mappings embedded in control logic. Case:

/* function was defined 100 lines above */
if( Type == Banana )
{
/* do Banana stuff */
}
else if( Type == Orange )
{
/* do Orange stuff */
}
else if( Type == Potato )
{
/* do Potato stuff */
}
else
{
/* issue "unsupported type" error and bail */
}


While the purpose might be perfectly understandable with real-world code, the reader has the problem that execution isn't directly stopped when the right type is found. This leads to two things: Execution of (minor) irrelevant code, and possible ambiguity in which if/else statements have been executed. (the branches might not be mutually exclusive).

Anyways, my strategy is to resolution in a separate function and keep execution in the control logic completely linear:

/* function was defined 100 lines above */
bool bOk = HandleVegetable( Type );
if( !bOk )
{
/* return error code */
}

with

bool HandleVegetable( VegetableType Type )
{
if( Type == Banana )
{
/* do Banana stuff */
return true;
}
if( Type == Orange )
{
/* do Orange stuff */
return true;
}
if( Type == Potato )
{
/* do Potato stuff */
return true;
}
return false;
}

in this way we're guaranteed type vs. execution uniqueness and as a consequence, a lot less headache in finding out which execution paths have been touched.

or i'm a newb.

torsdag den 25. november 2010

Internal protection

Yos all

I got a idea yesterday on how to guarantee that no member functions writes (or reads) to a given member variable.

Some c++ class end up being silly long (4000+ line) and no-one really have any idea of what's going on. Especially if the given class also derives from another class, is inherited by a third and used an aggregate in a fourth class. A big mess basically.

In such scenarios, you could easily be given the task of extending the class with a new type of subsystem X. I guess that any excuse could also be used.

Idea:

class BigClass
{
public:
// lotsa stuff
private:
bool m_StateOfX;
};

if we want to guarantee that the member function OnlyForX() is the only function that writes to m_StateOfX, when we can do the following (and accept some overhead):


class BigClass
{ /* bla */
private:
struct StrictX
{
public:
bool Get();
private:
bool m_StateOfX;
friend void BigClass::OnlyForX();
} m_StateOfX;
void OnlyForX();
}

if we further want to punish ourselves with templates we can also do that:

template
class Strict
{
public:
const T& Get() const { return m_Val; }
protected:
T m_Val;
};

class BigClass
{
class StrictX : public Strict { friend BigClass::OnlyForX(); };
void OnlyForX();
};

there you go! it might even compile..