5.1 Valgrind for CS241
If you are using C++ in this course, Marmoset will run your submissions with Valgrind. Valgrind is a program that detects memory-related errors.
A common misconception is that Valgrind only detects memory leaks, and that if you don’t use new in your program you shouldn’t get Valgrind errors. Valgrind actually detects a number of memory errors other than leaks, such as uses or accesses of uninitialized memory. Additionally, there are ways you can leak memory even if you don’t use new, such as if your program terminates improperly and is unable to clean up stack-allocated objects. Improper termination can be caused by uncaught exceptions or by using the exit function (which should not be used in C++).
Valgrind error messages can be quite long and intimidating. This guide is intended to give you an idea of how to handle these errors.
5.1.1 General Tips
Solve the first error
When confronted with a massive Valgrind report consisting of many errors, a good idea is to start by just solving the first error. Often memory errors compound, and one error will cause many other errors throughout the program. Solving the first error that Valgrind shows you will sometimes fix many other errors, possibly even all the errors. Pretend the error message consists only of the first error, and ignore everything else.
Marmoset is actually set up to only show you the first error that Valgrind detects for this reason. However, when running Valgrind on your own, you might see many errors.
Look for function names and line numbers
If you compile your program with the -g flag, Valgrind will show you the function names and line numbers where errors occur. Sometimes the actual bug occurs on a different line (particularly for uninitialized value errors) but the line number Valgrind tells you is a good starting point.
For example, in this message, there was a use of an uninitialized value in the main function on line 6 of the program. The bug is probably not on line 6 itself but rather earlier in the program where the value was left uninitialized. However, looking at line 6 can give you an idea of which value might have been left uninitialized.
==98641== Conditional jump or move depends on uninitialised value(s)
==98641== at 0x1091F3: std::vector, std::allocator >, std::allocator, std::allocator > > >::resize(unsigned long) (stl_vector.h:691)
==98641== by 0x109016: main (program.cc:6)
Look for the last point in the stack trace where your program appears
Consider the following error:
==51205== Invalid read of size 8
==51205== at 0x4F7B905: assign (basic_string.h:1439)
==51205== by 0x4F7B905: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(char const*) (basic_string.h:705)
==51205== by 0x108A3D: h() (program.cc:6)
==51205== by 0x108A8A: g() (program.cc:9)
==51205== by 0x108A96: f() (program.cc:11)
==51205== by 0x108AA2: main (program.cc:14)
==51205== Address 0x8 is not stack'd, malloc'd or (recently) free'd
Did the error happen on line 6, line 9, line 11, or line 14 of program.cc? Or did it happen somewhere in the C++ standard library?
Unless there’s a bug in the C++ standard library, the error probably happened in your program. Furthermore, the error probably happened at the last point in the stack trace where your program appears.
In this case, we see that main called function f, then function f called function g, then function g called function h, and then function h did something with an assignment operator, which led to an invalid read error. The problem is likely in function h, although you might still want to look elsewhere if you can’t find any problems in h.
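The program that produced this trace is not shown, so the following is only a guess at the kind of bug involved. One thing that can cause an invalid read at a small address such as 0x8 inside a string assignment is assigning through a null std::string pointer. The function setName below is a hypothetical stand-in for h, written with the check that avoids the invalid read.

```cpp
#include <string>

// Hypothetical stand-in for the function h in the trace above.
// Assigning through a null std::string pointer makes the assignment operator
// read the string's internal fields at an address near 0, which Valgrind
// reports as an invalid read. Checking the pointer first avoids that.
bool setName(std::string *name) {
    if (name == nullptr) return false;  // without this check: invalid read
    *name = "hello";                    // uses operator=(char const*)
    return true;
}
```

If the check were missing, the stack trace would end in the standard library, but the line to fix would still be the assignment in your own function.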
5.1.2 Common types of Valgrind errors
Invalid reads and invalid writes
Invalid read errors and invalid write errors occur when you try to read from or write to a part of memory that you shouldn’t be accessing. A very common reason for this is if you try to access an element of a vector or other data structure that doesn’t exist. For example, if you access an index that is past the end of a vector, you will likely get one of these errors.
Valgrind will tell you the line where the invalid read or write occurred, and usually there will be some code that accesses a vector or other data structure on that line. Think about whether this access is always valid. Could there be a case where you reach this line without adding the required elements to the data structure? The invalid access might only occur in certain cases, such as when the input contains a blank line.
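As a minimal sketch of the blank-line case, the hypothetical helpers tokenize and firstToken below (illustrative names, not part of any course code) show an access that is only valid when the line contains at least one token, together with the check that makes it safe.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a line into whitespace-separated tokens.
std::vector<std::string> tokenize(const std::string &line) {
    std::istringstream in{line};
    std::vector<std::string> tokens;
    std::string tok;
    while (in >> tok) tokens.push_back(tok);
    return tokens;
}

// Returns the first token of the line, or an empty string for a blank line.
// Without the empty() check, tokens[0] on a blank line would read past the
// end of an empty vector, which Valgrind typically reports as an invalid read.
std::string firstToken(const std::string &line) {
    std::vector<std::string> tokens = tokenize(line);
    if (tokens.empty()) return "";
    return tokens[0];
}
```

Returning an empty string is just one possible way to handle the blank line; depending on your program, skipping the line or reporting an error may be more appropriate.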
Uninitialized value errors
The error message "Conditional jump or move depends on uninitialised value(s)" essentially means Valgrind has determined that the visible behaviour of your program depends on uninitialized memory. Sometimes you will also see the message "Use of uninitialised value of size N".
Valgrind will report the line at which the program depends on the uninitialized value. It will allow uninitialized values to be moved and copied around in memory without reporting an error, as long as the program doesn’t depend on these values. This can make the error hard to find because the mistake could be far away from the line Valgrind reports. For example, consider the following program:
1 #include <iostream>
2 #include <vector>
3 int main() {
4 int i;
5 i += 1;
6 int j = i+2;
7 std::vector<int> v {i,j};
8 v.push_back(i+j);
9 for(int i : v) {
10 std::cout << i << std::endl;
11 }
12 }
Valgrind will report errors on line 10. The actual problem is on line 4, where we forgot to assign a value to i. But Valgrind allows us to increment i, assign i+2 to a new variable j, create a vector containing i and j, and add i+j to the vector, all without complaints, because the visible behaviour of the program isn’t actually affected until we try to output the values of the vector.
The line number that Valgrind tells you is still helpful, because you know that somewhere on that line you’re using an uninitialized value. But you might have to do some detective work to figure out which value is uninitialized and why it was not initialized. It’s not always as simple as forgetting to give a default value to a variable. Maybe your program reads data from standard input, and there is a bug in the input reading function that causes some of the data variables to be uninitialized in certain cases.
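One way to guard against the input-reading case is sketched below. The Point type and the readPoint and parsePoint functions are hypothetical names chosen for illustration. The idea is to give every field a default value and to check whether each read actually succeeded before using the result.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record type. Both fields have default values, so no code path
// can leave them uninitialized.
struct Point {
    int x = 0;
    int y = 0;
};

// Reads two integers into p. Returns false if the input was malformed or
// ended early, in which case p is left unchanged.
bool readPoint(std::istream &in, Point &p) {
    int x = 0;
    int y = 0;
    if (!(in >> x >> y)) return false;  // check that BOTH reads succeeded
    p.x = x;
    p.y = y;
    return true;
}

// Convenience wrapper that parses a point out of a string.
bool parsePoint(const std::string &text, Point &p) {
    std::istringstream in{text};
    return readPoint(in, p);
}
```

A caller that ignores the return value can still compute with the wrong data, but it will be well-defined default data rather than uninitialized memory.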
Memory leaks
Sometimes Valgrind will report that your program leaked memory. There are a few reasons this can happen.
Forgetting to deallocate things you allocated
This is the most obvious and easily avoidable reason for memory leaks, but sometimes these mistakes happen. If you are using new, did you call delete on everything you allocated? Did you use the correct type of delete? (If you allocate an array with new[], you need to use delete[] instead of delete.)
You can avoid these problems by using smart pointers instead of new and delete, although smart pointers come with their own difficulties.
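As a rough sketch of what this can look like, the example below uses std::unique_ptr and std::make_unique (which requires C++14 or later). The Node type and the makeTree and makeBuffer functions are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node. Children are owned through std::unique_ptr, so
// destroying a node automatically destroys its whole subtree.
struct Node {
    std::string label;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string l) : label{std::move(l)} {}
};

// Builds a small tree without writing new or delete anywhere.
std::unique_ptr<Node> makeTree() {
    auto root = std::make_unique<Node>("expr");
    root->children.push_back(std::make_unique<Node>("term"));
    root->children.push_back(std::make_unique<Node>("factor"));
    return root;  // ownership moves to the caller
}

// Array allocation: unique_ptr<int[]> calls delete[] for you.
std::unique_ptr<int[]> makeBuffer(std::size_t n) {
    return std::make_unique<int[]>(n);  // elements are value-initialized to 0
}
```

When the unique_ptr returned by makeTree goes out of scope, the root and both children are freed, provided the program terminates properly.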
Improper termination
If you aren’t using new, you might be confused as to how you can possibly be leaking memory. STL classes like vector and map use new internally, but they are designed to clean up their allocated memory correctly when their destructors are called. If you terminate your program improperly though, their destructors might not be called, and then memory will be leaked.
A common reason for this is using the exit function. This function is seemingly useful for terminating the program at an arbitrary point, but it has a catch: it doesn’t call the destructors of stack-allocated objects before exiting. For this reason, you should avoid this function in C++. Instead, throw an exception from the point where you want to exit, catch the exception in your main function, and then return from main normally. If your whole program is in main, you can also just use return statements instead of exceptions; returning from main will do the proper cleanup. However, using exceptions and a single return point in main is arguably cleaner than having multiple return points in main.
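The pattern described above might be sketched as follows. The FatalError type and the parseNumber and run functions are hypothetical names invented for this example; run plays the role of the body of main, and takes its streams as parameters only so that it is easy to test.

```cpp
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type meaning "stop the program now".
struct FatalError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// A nested helper that might otherwise be tempted to call exit.
int parseNumber(const std::string &line) {
    std::istringstream in{line};
    int value = 0;
    if (!(in >> value)) throw FatalError{"expected a number: " + line};
    return value;
}

// Sums one number per line. Every path returns normally, so `values` and all
// other local objects are destroyed and their memory is released.
int run(std::istream &in, std::ostream &out, std::ostream &err) {
    std::vector<int> values;
    try {
        std::string line;
        while (std::getline(in, line)) values.push_back(parseNumber(line));
    } catch (const FatalError &e) {
        err << "ERROR: " << e.what() << std::endl;
        return 1;
    }
    int sum = 0;
    for (int v : values) sum += v;
    out << sum << std::endl;
    return 0;
}
```

With this structure, main can be a single line: int main() { return run(std::cin, std::cout, std::cerr); }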
Uncaught exceptions
Another type of improper termination is throwing an exception and not catching it. If you are running Valgrind on your own, the error message will look something like the one below. (With the way Valgrind is configured on Marmoset, it won’t show a message like this, but Marmoset tries to detect that an uncaught exception occurred and inform you.)
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
==119257==
==119257== Process terminating with default action of signal 6 (SIGABRT)
==119257== at 0x5472E97: raise (raise.c:51)
==119257== by 0x5474800: abort (abort.c:79)
==119257== by 0x4ED587D: __gnu_cxx::__verbose_terminate_handler() [clone .cold] (vterminate.cc:95)
==119257== by 0x4EE1485: __cxxabiv1::__terminate(void (*)()) (eh_terminate.cc:48)
==119257== by 0x4EE14F0: std::terminate() (eh_terminate.cc:58)
==119257== by 0x4EE1744: __cxa_throw (eh_throw.cc:95)
==119257== by 0x4ED8036: std::__throw_out_of_range_fmt(char const*, ...) [clone .cold] (functexcept.cc:96)
==119257== by 0x108BEB: std::vector >::_M_range_check(unsigned long) const (stl_vector.h:825)
==119257== by 0x108AC8: std::vector >::at(unsigned long) (stl_vector.h:846)
==119257== by 0x10899E: main (exception.cc:5)
Notice that "throw" appears several times in the error message, indicating the error is related to throwing an exception. The message "Process terminating with default action of signal 6 (SIGABRT)" is also a telltale sign of an uncaught exception, because uncaught exceptions will cause the program to "abort".
Usually when you get this error, it’s not because of an exception you threw yourself, although it could be if you wrote your catch clause incorrectly. More likely, the exception you failed to catch comes from the C++ standard library.
In this case, you can tell by looking at the stack trace that the vector at function threw an out_of_range exception. The solution to this problem most likely isn’t to catch this exception, but rather to make sure you avoid doing out-of-range accesses with at. Depending on how your program is designed though, you might want to catch some C++ standard library exceptions.
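Both options can be sketched briefly. The functions tokenOrDefault and tryGetToken are hypothetical helpers written for this example: the first avoids the out-of-range access entirely, and the second shows what catching the exception close to the access could look like in a design that wants to do so.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Preferred: check the index before accessing, so at() never throws.
std::string tokenOrDefault(const std::vector<std::string> &tokens,
                           std::size_t index,
                           const std::string &fallback) {
    if (index < tokens.size()) return tokens.at(index);
    return fallback;
}

// Alternative: catch std::out_of_range near the access instead of letting it
// escape from main. Returns false and leaves result unchanged on failure.
bool tryGetToken(const std::vector<std::string> &tokens,
                 std::size_t index,
                 std::string &result) {
    try {
        result = tokens.at(index);
        return true;
    } catch (const std::out_of_range &) {
        return false;
    }
}
```

In the trace above, at was called with index 0 on a vector of size 0, which is exactly the case the size check in tokenOrDefault rules out.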
5.1.3 Exceeding Marmoset limits
Marmoset places several limits on your program to make sure it doesn’t destroy the testing servers. If your program exceeds one of Marmoset’s limits, Marmoset will instantly terminate your program without letting it clean anything up, generally causing a memory leak. This can be a little confusing because your program doesn’t necessarily have an actual memory leak; the actual problem is that you are exceeding one of Marmoset’s limits.
Time limits
Marmoset places a limit on the amount of time your program takes. If you see the following message in your Valgrind output, you are probably exceeding the time limit:
Process terminating with default action of signal 24 (SIGXCPU)
SIGXCPU is a signal that indicates the process exceeded its CPU time limit. Look for efficiency issues in your program. Some common ones are:
Passing large data structures by value instead of by reference (which creates a copy of the data structure every time).
Using an inappropriate data structure. If you need to store a sequence of values, vectors are often what you want. However, if you need to look up values based on a key (like for a symbol table) you should use a map instead of a vector of pairs; iterating over a vector to search for a key is slow. Additionally, if you need to delete values from the start of the sequence of values, you should not use a vector because each deletion from the start requires all the other elements to be shifted. Use a deque or list instead.
Using an appropriate data structure, but using it incorrectly. For example, if you are using a map, you should not iterate over all the keys in the map to find a particular key. Use the proper lookup functions which are more efficient.
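The points in this list can be illustrated with a small sketch. The SymbolTable alias and the containsSlow, lookup, and takeFront functions are hypothetical names chosen for this example, not part of any course-provided code.

```cpp
#include <deque>
#include <map>
#include <string>

using SymbolTable = std::map<std::string, int>;

// Slow: scans every entry, so each lookup takes time linear in the table size.
bool containsSlow(const SymbolTable &table, const std::string &name) {
    for (const auto &entry : table) {
        if (entry.first == name) return true;
    }
    return false;
}

// Fast: map::find does a logarithmic-time lookup.
// The table is passed by const reference, so no copy is made per call.
// (Declaring the parameter as `SymbolTable table` would copy every entry.)
bool lookup(const SymbolTable &table, const std::string &name, int &value) {
    auto it = table.find(name);
    if (it == table.end()) return false;
    value = it->second;
    return true;
}

// Removing from the front of a deque does not shift the remaining elements,
// unlike erasing the first element of a vector.
// Precondition: queue is not empty.
int takeFront(std::deque<int> &queue) {
    int front = queue.front();
    queue.pop_front();
    return front;
}
```

On small inputs the slow and fast versions behave identically, which is why these problems often only show up as time limit failures on larger tests.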
Output limits
Marmoset places a limit on the amount of output your program produces. Usually you will not exceed this limit unless one of the following situations happens:
You accidentally leave debug prints in your program, causing your program to produce a ton of debugging output.
Your program gets stuck in an infinite loop and produces an infinite amount of output. Even in this case, you might hit the time limit before the output limit.
As long as you remember to disable or remove debug printing before submitting your program, you shouldn’t have to deal with this issue.
If you do exceed the output limit, Marmoset should print an informative error message.
Memory limits
Marmoset places a limit on the amount of memory your program uses. The memory limit is around 256 MB. You can enforce your own memory limit using the ulimit command.
(ulimit -d 262144 && valgrind ./yourProgram)
The brackets in the command above run ulimit in a subshell, which means other commands will not get affected by the memory limit. The number 262144 is the memory limit in kilobytes. 256 MB is 262144 KB.
The message Valgrind gives if you exceed the memory limit is sometimes very strange and long, but it should say something like this somewhere:
==63134== Valgrind's memory management: out of memory:
==63134== newSuperblock's request for 4194304 bytes failed.
==63134== 264,896,512 bytes have already been mmap-ed ANONYMOUS.
==63134== Valgrind cannot continue. Sorry.