Hello There, Guest! Login or Register (why?)

Post Reply 
 
Thread Rating:
  • 1 Votes - 5 Average
  • 1
  • 2
  • 3
  • 4
  • 5
[guide] Optimizations in your C++ Program
02-21-2010, 02:29 PM (This post was last modified: 02-23-2010 06:40 AM by Anman.)
Post: #1
Rainbow [guide] Optimizations in your C++ Program
Hi everyone.
This is my very first real article on anything programming related, so bare with me.

I'm not actually going to go too far into this, just going to bring up some small things that you can do to make your programs more efficient (sometimes slightly, other times greatly).

Now, for those compiling with gcc, it's a pretty smart compiler, and it will most likely assume what you're trying to do, and correct it accordingly. Though the same cannot be said about every compiler, or every version of every compiler; such as the while vs for (I'll get to that in a moment); I think microsoft and borland's compile it literally (optimization, aside).
So it's better to be safe than sorry, especially if you don't know if your code will compiler on various platforms.

while() vs for()
NOTE:My compiler ended up auto-optimizing this, so I couldn't create some assembly code to match my argument (though you can google it up, and you'll find some code generated on other compilers which support this argument), but I can generate some assembly that would look similar, if you want.

There can be abit of controversy when it comes to for vs while for an infinite loop.
In a literal compilation, while() and for() will not do the same thing.
Consider the following two:
Code:
while(1)
  std::cout<<"oh noes!"<<std::endl;
for(;;)
  std::cout<<"oh noes!"<<std::endl;
What the program may compile to in the first example (via pseudo-assembly) is:
cpp Code:
  1. label:
  2. Check if '1'
  3. print "oh noes!" endline
  4. jump to 'label'

Here's the second code:
cpp Code:
  1. <empty statement -- do nothing>
  2. label:
  3. <empty statement -- do nothing>
  4. print "oh noes!" endline
  5. <empty statement -- do nothing>
  6. jump to 'label'

Where you see "<empty statement -- do nothing>", that's actually where the two semi-colons for the for loop are.
So in short..
A while() loop, since it's conditional, will validate to make sure that the condition '1' is true, if so, it will continue the loop, whereas a for() statement would have no statement to condition, so it simply loops.
One fun thing I like to do in my code, is this:
cpp Code:
  1. #define EVER ;;
  2. for(EVER) //muhaha!


Postfix vs Prefix operators
One thing I see alot in code, is the use of 'var++' over '++var'.
Protip: Unless you need the value of 'var' before incrementing it, use pre-incremental.
You may think, "what harm can ++i do?". Honestly, not that much, if it's just an integer.
See, when you use ++i, it increments i, and returns it's value. When you're using i++, it stores the value of i, increments it, and returns it's old value.
In assembly, this is just switching of three commands, but what about with objects?
A great example I found here, is the following:
cpp Code:
  1. class MyClass
  2. {
  3. public:
  4. MyClass& operator++() //Prefix increment operator (++x)
  5. {
  6. //Perform increment operation
  7. return *this;
  8. }
  9.  
  10. MyClass operator++(int) //Postfix increment operator (x++)
  11. {
  12. MyClass temp = *this;
  13. //Perform increment operation
  14. return temp;
  15. }
  16. };

As you see, a new object is initialized, holding the old object, and returns that object.
This means that you're taking a performance shot to the stack every time you use postfix.

Dynamic allocation vs static (for lack of a better term) allocation.
First off, let me define dynamic and static:
cpp Code:
  1. {
  2. //static:
  3. object obj;
  4. int foo;
  5. foo = 0;
  6. obj.setValue(foo);
  7. }
  8. {
  9. //dynamic:
  10. object *obj = new object;
  11. int *foo = new int;
  12. *foo = 0;
  13. obj->setValue(*foo);
  14. delete foo;
  15. delete obj;
  16. }
  17. //Please excuse my dereferencing if it's wrong.

Now, the primary difference between the dynamic and static, is that when you dynamically allocate data, it is saved in the heap; whereas if you statically allocate it, it is saved on the stack.
So for large objects, it's a good idea to dynamically allocate it, to preserve the stack.
Another benefit of dynamically allocated objects, is that they don't have to follow the law of scope, and in a sense, they never go out of scope until they are manually deleted.

Two-Dimensional Array Access
Access wouldn't be the word -- I should say traverse a two-dimensional array.
But the question is, which is better to use?
ary[x][y] or ary[y][x]?

It may not seem like there's any difference, but one is actually much slower than the other.
Here's an example of what the array foo[3][3] would look like in memory (don't question my epic pant skillz ;D):
[Image: 2qdslqv.png]
Putting more into perspective, consider the two:
cpp Code:
  1. for(x=0; x<3; ++x)
  2. for(y=0; y<3; ++y)
  3. foo[x][y] = 0;
  4. for(x=0; x<3; ++x)
  5. for(y=0; y<3; ++y)
  6. foo[y][x] = 0;

Here is what the first five steps of the first for/for loop is doing in memory:
(the red dot represents where in memory the program is pointing to, and the arrow represents where that pointer is pointing to).
[Image: wt8zgo.png]
As you can see, it's using the pointer foo[0] to reach each location of memory that is next to it.
Next, we look at the second part:
[Image: 2ewpgkw.png]

As you can see, the second way, the data pointer is throwing itself all over memory, instead of moving though it one block at a time.
To prove it's length, here is a program I threw together (well, I modified it from the original C program my professor used last year) that creates an array that's 0xF00*0xF00 in size, and traverses it:
http://pastebin.com/f45636265
Here are the results for when I ran it:
x->y: 0.0500
y->x: 0.1000
As expected, the latter was about half as slow.

switch() vs else if
I generated some assembly output for this one (I generated assembly output for the rest as well, but those aren't worth looking at).
One thing I try to do, is use switch() as much as possible, as opposed to else if.
Consider the following two programs:
cpp Code:
  1. int main()
  2. {
  3. int myVar = 0;
  4. int foo;
  5. if(myVar == 1)
  6. ++myVar;
  7. else if(myVar == 2)
  8. --myVar;
  9. else if(myVar == 3)
  10. myVar++;
  11. else if(myVar == 4)
  12. myVar--;
  13. else if(myVar == 5)
  14. --myVar;
  15. else if(myVar == 6)
  16. myVar=0x3;
  17. else if(myVar == 7)
  18. myVar=0xff;
  19. else if(myVar == <img src="http://lastcode.net/images/smilies/CrystalXP/glasses.gif" style="vertical-align: middle;" border="0" alt="Glasses" title="Glasses" />
  20. myVar=0xff;
  21. else if(myVar == 9)
  22. myVar=0x2a;
  23. else if(myVar == 10)
  24. myVar=0x801;
  25. else if(myVar == 11)
  26. myVar=0x6;
  27. else if(myVar == 12)
  28. myVar=0xfa;
  29. else if(myVar == 13)
  30. myVar=0x0354;
  31. else if(myVar == 14)
  32. myVar=myVar+8;
  33. else if(myVar == 15)
  34. myVar=415;
  35. else
  36. myVar=0xF;
  37. foo = myVar;
  38. return 0;
  39. }

cpp Code:
  1. int main()
  2. {
  3. int myVar = 0;
  4. int foo;
  5. switch(myVar)
  6. {
  7. case 1:
  8. ++myVar;
  9. break;
  10. case 2:
  11. --myVar;
  12. break;
  13. case 3:
  14. myVar++;
  15. break;
  16. case 4:
  17. myVar--;
  18. break;
  19. case 5:
  20. --myVar;
  21. break;
  22. case 6:
  23. myVar=0x3;
  24. break;
  25. case 7:
  26. myVar=0xff;
  27. break;
  28. case 8:
  29. myVar=0xff;
  30. break;
  31. case 9:
  32. myVar=0x2a;
  33. break;
  34. case 10:
  35. myVar=0x801;
  36. break;
  37. case 11:
  38. myVar=0x6;
  39. break;
  40. case 12:
  41. myVar=0xfa;
  42. break;
  43. case 13:
  44. myVar=0x0354;
  45. break;
  46. case 14:
  47. myVar=myVar+8;
  48. break;
  49. case 15:
  50. myVar=415;
  51. break;
  52. }
  53. foo = myVar;
  54. return 0;
  55. }

The brackets are left out, as it would make the else if too long.
Now, when this is compiled, the first program would end up compiling into a mess of cmp and jmp commands, one-by-one, working its way down.
If myVar happened to be 15, it would end up doing 15 conditionals. The worst-case scenario here, would be that myVar isn't a number listed, and the result would be a test of every possible condition, until it reaches the else.

As for the switch(), that actually manages to be pretty efficient (well... VERY efficient).
This is because the second will compile into jump tables -- or hash maps. This means that there is only one test done, and it knows which to jump to.
You can find the assembly output for the first program here: http://pastebin.com/f511c7140 and the second here: http://pastebin.com/f654d9226

We now have a situation though. And that's the question of "what should we do with strings?".
I ran into this exact problem when I was writing my virtual machine, well, more specifically, my assembler.
In my first version, I was using a mess of else if and strcmp; here's a snippet of my parser, so you can see what I mean:
cpp Code:
  1. char* first = new char[256];
  2. //Get first 4 characters
  3. while(line[j]!='\0' && line[j]!=' ' && line[j] != '\n' && line[j]!=';' && line[j]!='\r')
  4. first[j] = line[j++]; //Get instruction
  5. first[j++]='\0';
  6. /*Flow*/
  7. if(!(strcmp("jmp", first)))
  8. {
  9. if(jumps(line, second, j, result, table, size))
  10. {
  11. result.dword = 1;
  12. return result;
  13. }
  14. else
  15. result.byte[3] = 0x20;//Set the highest byte
  16. }
  17. else if(!(strcmp("jz", first)))
  18. {
  19. if(jumps(line, second, j, result, table, size))
  20. {
  21. result.dword = 1;
  22. return result;
  23. }
  24. else
  25. result.byte[3] = 0x24;//Set the highest byte
  26. }
  27. else if(!(strcmp("jg", first)))
  28. //etc

This honestly was killing me!
The reason for this issue, is because a string (whether it's the std string, or a char*) is pretty much an array of characters, and you can't treat an array as one value.

My ultimate solution, was to hash the string, throw the solution keys into an enumeration, and run a switch() on the key -- it greatly improved the efficiency, and the num helped make things more sensible.

There are other solutions though; one, is to use nested switch() statements -- they have been used professionally before, but at the cost of making your wonderful code look horrible, also, the efficiency drops the more nested switches you use; here's an example that does something for the following strings: "jmp", "je", "cat"
cpp Code:
  1. switch(str[0])
  2. {
  3. case 'j':
  4. switch(str[1])
  5. {
  6. case 'm':
  7. //do stuff for jmp
  8. break;
  9. case 'e':
  10. //do stuff for je
  11. break;
  12. }
  13. break;
  14. case 'c':
  15. switch(str[1])
  16. {
  17. case 'a':
  18. //do stuff for "cat"
  19. break;
  20. }
  21. break;
  22. }


The second, is to use the STL Container classes, and use the strings as the key, and just have the entries point to functions like in a hash table.

Other than that, I can't really think of any other ways of doing this.

Inline vs Non-Inline
Just a quickie on here.
If you call a certain function many times (esepcially a small function, such as void "addXY(int& x, int& y, int& z) {z = x+y;}", you may want to consider making it inline.
Use inline sparingly though, because it doesn't always result in faster code, and usually results in longer code

Preprocessors
Another controversy is whether to use #define (and macros) or use typedef (or functions).
Since the preprocessors are made during compile-time rather than run-time, they will be faster, but I personally use typedef over #define due to the fact I feel it's safer to work to work with type-safe objects then constants (you can fund much discussion on this subject).

Compiler Flags
For those using GCC, there are afew flags you should be aware of when compiling your code.
First, the -O (not -o) flag, you can choose the level of compiler-size optimization for your code:
-O0
This option turns all optimizations off.
-O1
This option will try to reduce the code's size and efficiency while trying not to slow down the compiling process.
-O2
Level two optimization will enable as just about all the compiler optimizations that shouldn't hurt your program. This one may show down the actual time it will take to compile, but it's worth it.
I'd always recommend -O2
-O3
This turns all the optimizations on; there's a chance it will break your code though, I don't touch it.
-mtune and -march
mtune will tune your program to your CPU type, and march will generate assembly for your specific CPU (it will assume that mtune=march).
If you don't mind your program only working on your computer, you can use:
-march=native
Which will detect your CPU type, and act accordingly.
If you want your binary to execute on any machine (well, any x86), you can use -march=generic (or -mtune=generic); it will use all of those slow legacy instructions for the 8086 Surprised
You can see the rest of the -m optimizations here:
http://gcc.gnu.org/onlinedocs/gcc/i386-a...tions.html



Well, I hope this helped someone -- I enjoyed writing it atleat Tongue
Find all posts by this user
Quote this message in a reply
02-21-2010, 02:34 PM
Post: #2
RE: [guide] Optimizations in your C++ Program
This is all over my head, but it looks awesome and I can't thank you enough for contributing. I really like this site even though I'm new here and hope to learn a lot and people like you will make that possible! Thanks. Smile
Visit this user's website Find all posts by this user
Quote this message in a reply
02-21-2010, 02:36 PM
Post: #3
Wink RE: [guide] Optimizations in your C++ Program
(02-21-2010 02:34 PM)Uskdara Wrote:  This is all over my head, but it looks awesome and I can't thank you enough for contributing. I really like this site even though I'm new here and hope to learn a lot and people like you will make that possible! Thanks. Smile
Thanks!
C++ isn't too hard once you get the hang of it; it's actually one of the easier languages.
Find all posts by this user
Quote this message in a reply
02-21-2010, 02:41 PM
Post: #4
RE: [guide] Optimizations in your C++ Program
Yeah, I'm not completely illiterate, but I'm also far from experienced. I was just getting into object stuff, just got pass classes and functions in my college course when I had to drop out for financial reasons. (My Dad's net worth is too much for me to qualify for financial aid even though his business is in the negative, kind of irritating but oh well Tongue I'll keep saving up and earn my own way back)
Visit this user's website Find all posts by this user
Quote this message in a reply
02-21-2010, 10:39 PM
Post: #5
RE: [guide] Optimizations in your C++ Program
Thanks for contributing! Tongue

Argh! The knowledge, my head *explodes*.
Jk, now I know why there is the option of ++i or i++; xD
Other than using it like:
Code:
int a = i++;
int b = ++i;
std::cout << a << ", " << b << std::endl;
And I should start using switch() more. =/

[Image: mickcy.png]
Visit this user's website Find all posts by this user
Quote this message in a reply
02-23-2010, 06:41 AM (This post was last modified: 02-23-2010 06:41 AM by Anman.)
Post: #6
RE: [guide] Optimizations in your C++ Program
Thanks for contributing such a well thought out and detailed tutorial! I'm sure this will help out a bunch of beginners.

HTML-Tricks
Find all posts by this user
Quote this message in a reply
Post Reply 


Forum Jump:


Contact UsLastCode.netReturn to TopReturn to ContentLite ModeRSS