Debugging GNU Emacs

Copyright (C) 1985, 2000, 2001, 2002, 2003, 2004,
   2005, 2006, 2007, 2008 Free Software Foundation, Inc.
See the end of the file for license conditions.


[People who debug Emacs on Windows using Microsoft debuggers
should read the Windows-specific section near the end of this
document.]

** When you debug Emacs with GDB, you should start it in the directory
where the executable was made. That directory has a .gdbinit file
that defines various "user-defined" commands for debugging Emacs.
(These commands are described below under "Examining Lisp object
values" and "Debugging Emacs Redisplay problems".)

** When you are trying to analyze failed assertions, it will be
essential to compile Emacs either completely without optimizations or
at least (when using GCC) with the -fno-crossjumping option. Failure
to do so may make the compiler recycle the same abort call for all
assertions in a given function, rendering the stack backtrace useless
for identifying the specific failed assertion.

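As a rough sketch, the two build choices described above could be
selected as follows. The helper name `emacs_debug_cflags' and the exact
flag sets (`-O0 -g3' and `-O2 -g3 -fno-crossjumping') are illustrative
assumptions; passing CFLAGS to configure is the ordinary Autoconf
convention and not something specific to this document.

```shell
# Print a CFLAGS value suitable for chasing failed assertions.
# Hypothetical helper: with "noopt" it disables optimization entirely,
# otherwise it keeps optimization but turns off GCC's cross-jumping pass.
emacs_debug_cflags() {
    if [ "${1:-}" = "noopt" ]; then
        printf '%s\n' '-O0 -g3'
    else
        printf '%s\n' '-O2 -g3 -fno-crossjumping'
    fi
}

# Usage (run in the top-level Emacs source directory):
#   CFLAGS="$(emacs_debug_cflags noopt)" ./configure && make
```

Building without optimization is the simpler choice when build time and
speed do not matter; the second form keeps the optimized code closer to
what users normally run.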
** It is a good idea to run Emacs under GDB (or some other suitable
debugger) *all the time*. Then, when Emacs crashes, you will be able
to debug the live process, not just a core dump. (This is especially
important on systems which don't support core files, and instead print
just the registers and some stack addresses.)

** If Emacs hangs, or seems to be stuck in some infinite loop, typing
"kill -TSTP PID", where PID is the Emacs process ID, will cause GDB to
kick in, provided that you run under GDB.

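The command above can be wrapped in a small shell function so that a
missing PID is caught before `kill' runs. This is only a sketch; the
function name `stop_emacs_for_gdb' is hypothetical, and the PID is
whatever `ps' reports for the Emacs process being debugged.

```shell
# Send SIGTSTP to a running Emacs so that the GDB controlling it
# regains control.  Hypothetical helper; $1 is the Emacs process ID.
stop_emacs_for_gdb() {
    if [ -z "${1:-}" ]; then
        echo "usage: stop_emacs_for_gdb PID" >&2
        return 2
    fi
    kill -TSTP "$1"
}

# Usage, from a shell other than the one running GDB:
#   stop_emacs_for_gdb 12345     # 12345 stands for the Emacs process ID
```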
** Getting control to the debugger

`Fsignal' is a very useful place to put a breakpoint.
All Lisp errors go through there.

It is useful, when debugging, to have a guaranteed way to return to
the debugger at any time. When using X, this is easy: type C-z at the
window where Emacs is running under GDB, and it will stop Emacs just
as it would stop any ordinary program. When Emacs is running in a
terminal, things are not so easy.

The src/.gdbinit file in the Emacs distribution arranges for SIGINT
(C-g in Emacs) to be passed to Emacs and not give control back to GDB.
On modern POSIX systems, you can override that with this command:

    handle SIGINT stop nopass

After this `handle' command, SIGINT will return control to GDB. If
you want the C-g to cause a QUIT within Emacs as well, omit the
`nopass'.

A technique that can work when `handle SIGINT' does not is to store
the code for some character into the variable stop_character. Thus,

    set stop_character = 29

makes Control-] (decimal code 29) the stop character.
Typing Control-] will cause an immediate stop. You cannot
use the set command until the inferior process has been started.
Put a breakpoint early in `main', or suspend Emacs,
to get an opportunity to do the set command.

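The value 29 above follows the usual ASCII rule for control characters:
take the code of the printable character (`]' is 93) and keep only its
low five bits. The sketch below works that out in the shell; the helper
name `ctrl_code' is hypothetical and is not part of Emacs or GDB.

```shell
# Print the decimal code of Control-CHAR for a printable ASCII CHAR,
# i.e. the character code with all but its low five bits cleared.
# ctrl_code is a hypothetical helper name.
ctrl_code() {
    code=$(printf '%d' "'$1")
    echo $(( code & 31 ))
}

ctrl_code ']'    # prints 29
```

The printed number is what goes on the right-hand side of the
`set stop_character = ...' command if you prefer a different key.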
When Emacs is running in a terminal, it is useful to use a separate terminal
for the debug session. This can be done by starting Emacs as usual, then
attaching to it from GDB with the `attach' command, which is explained in the
node "Attach" of the GDB manual.

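A minimal sketch of the second-terminal step, assuming the PID of the
running Emacs is already known. It uses GDB's `-p' command-line option,
which attaches at startup much as the `attach' command does from the
GDB prompt. The function name `attach_gdb' and the GDB variable are
hypothetical conveniences.

```shell
# Attach GDB to an already running Emacs from a second terminal.
# GDB may be overridden in the environment; attach_gdb is a
# hypothetical helper name.  "gdb -p PID" attaches to that process.
GDB=${GDB:-gdb}

attach_gdb() {
    "$GDB" -p "$1"
}

# Usage, in the second terminal:
#   attach_gdb 12345        # 12345 stands for the Emacs process ID
```

Running this from the directory where the executable was made lets GDB
pick up the .gdbinit file mentioned at the start of this document.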
** Examining Lisp object values.

When you have a live process to debug, and it has not encountered a
fatal error, you can use the GDB command `pr'. First print the value
in the ordinary way, with the `p' command. Then type `pr' with no
arguments. This calls a subroutine which uses the Lisp printer.

You can also use `pp value' to print the Emacs value directly.

To see the current value of a Lisp variable, use `pv variable'.

Note: It is not a good idea to try `pr', `pp', or `pv' if you know that Emacs
is in deep trouble: its stack smashed (e.g., if it encountered SIGSEGV
due to stack overflow), or crucial data structures, such as `obarray',
corrupted, etc. In such cases, the Emacs subroutine called by `pr'
might do more damage, such as overwriting some data that is important for
debugging the original problem.

Also, on some systems it is impossible to use `pr' if you stopped
Emacs while it was inside `select'. This is in fact what happens if
you stop Emacs while it is waiting. In such a situation, don't try to
use `pr'. Instead, use `s' to step out of the system call. Then
Emacs will be between instructions and capable of handling `pr'.

If you can't use the `pr' command, for whatever reason, you can use the
`xpr' command to print out the data type and value of the last data
value. For example:

    p it->object
    xpr

You may also analyze data values using lower-level commands. Use the
`xtype' command to print out the data type of the last data value.
Once you know the data type, use the command that corresponds to that
type. Here are these commands:

    xint xptr xwindow xmarker xoverlay xmiscfree xintfwd xboolfwd xobjfwd
    xbufobjfwd xkbobjfwd xbuflocal xbuffer xsymbol xstring xvector xframe
    xwinconfig xcompiled xcons xcar xcdr xsubr xprocess xfloat xscrollbar

Each one of them applies to a certain type or class of types.
(Some of these types are not visible in Lisp, because they exist only
internally.)

Each x... command prints some information about the value, and
produces a GDB value (subsequently available in $) through which you
can get at the rest of the contents.

In general, most of the rest of the contents will be additional Lisp
objects which you can examine in turn with the x... commands.

Even with a live process, these x... commands are useful for
examining the fields in a buffer, window, process, frame or marker.
Here's an example using concepts explained in the node "Value History"
of the GDB manual to print values associated with the variable
called frame. First, use these commands:

    cd src
    gdb emacs
    b set_frame_buffer_list
    r -q

Then Emacs hits the breakpoint:

    (gdb) p frame
    $1 = 139854428
    (gdb) xpr
    Lisp_Vectorlike
    PVEC_FRAME
    $2 = (struct frame *) 0x8560258
    "emacs@localhost"
    (gdb) p *$
    $3 = {
      size = 1073742931,
      next = 0x85dfe58,
      name = 140615219,
      [...]
    }

Now we can use `pp' to print the frame parameters:

    (gdb) pp $->param_alist
    ((background-mode . light) (display-type . color) [...])


The Emacs C code heavily uses macros defined in lisp.h. So suppose
we want the address of the l-value expression near the bottom of
`add_command_key' from keyboard.c:

    XVECTOR (this_command_keys)->contents[this_command_key_count++] = key;

XVECTOR is a macro, so GDB only knows about it if Emacs has been compiled with
preprocessor macro information. GCC provides this if you specify the options
`-gdwarf-2' and `-g3'. In this case, GDB can evaluate expressions like
"p XVECTOR (this_command_keys)".

When this information isn't available, you can use the xvector command in GDB
to get the same result. Here is how:

    (gdb) p this_command_keys
    $1 = 1078005760
    (gdb) xvector
    $2 = (struct Lisp_Vector *) 0x411000
    0
    (gdb) p $->contents[this_command_key_count]
    $3 = 1077872640
    (gdb) p &$
    $4 = (int *) 0x411008

Here's a related example of macros and the GDB `define' command.
There are many Lisp vectors such as `recent_keys', which contains the
last 100 keystrokes. We can print this Lisp vector:

    p recent_keys
    pr

But this may be inconvenient, since `recent_keys' is much more verbose
than `C-h l'. We might want to print only the last 10 elements of
this vector. `recent_keys' is updated in keyboard.c by the statement

    XVECTOR (recent_keys)->contents[recent_keys_index] = c;

So we define a GDB command `xvector-elts', so the last 10 keystrokes
are printed by

    xvector-elts recent_keys recent_keys_index 10

where you can define xvector-elts as follows:

    define xvector-elts
    set $i = 0
    p $arg0
    xvector
    set $foo = $
    while $i < $arg2
    p $foo->contents[$arg1-($i++)]
    pr
    end
    end
    document xvector-elts
    Prints a range of elements of a Lisp vector.
    xvector-elts v n i
    prints `i' elements of the vector `v' ending at the index `n'.
    end

** Getting Lisp-level backtrace information within GDB

The most convenient way is to use the `xbacktrace' command. This
shows the names of the Lisp functions that are currently active.

If that doesn't work (e.g., because the `backtrace_list' structure is
corrupted), type "bt" at the GDB prompt, to produce the C-level
backtrace, and look for stack frames that call Ffuncall. Select them
one by one in GDB, by typing "up N", where N is the appropriate number
of frames to go up, and in each frame that calls Ffuncall type this:

    p *args
    pr

This will print the name of the Lisp function called by that level
of function calling.

By printing the remaining elements of args, you can see the argument
values. Here's how to print the first argument:

    p args[1]
    pr

If you do not have a live process, you can use xtype and the other
x... commands such as xsymbol to get such information, albeit less
conveniently. For example:

    p *args
    xtype

and, assuming that "xtype" says that args[0] is a symbol:

    xsymbol

** Debugging Emacs Redisplay problems

The src/.gdbinit file defines many useful commands for dumping
redisplay-related data structures in a terse and user-friendly format:

    `ppt' prints value of PT, narrowing, and gap in current buffer.
    `pit' dumps the current display iterator `it'.
    `pwin' dumps the current window `win'.
    `prow' dumps the current glyph_row `row'.
    `pg' dumps the current glyph `glyph'.
    `pgi' dumps the next glyph.
    `pgrow' dumps all glyphs in current glyph_row `row'.
    `pcursor' dumps current output_cursor.

The above commands also exist in a version with an `x' suffix which
takes an object of the relevant type as argument.

** Following longjmp call.

Recent versions of glibc (2.4+?) encrypt stored values for setjmp/longjmp,
which prevents GDB from following a longjmp call using `next'. To
disable this protection you need to set the environment variable
LD_POINTER_GUARD to 0.

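One way to set the variable for a single debugging session, without
exporting it for the whole shell, is sketched below. The wrapper name
`without_pointer_guard' is hypothetical. Later glibc releases are
reported to ignore LD_POINTER_GUARD, so this should be read as applying
to the glibc versions the paragraph above has in mind.

```shell
# Run a command with LD_POINTER_GUARD set to 0 in its environment only.
# The wrapper name is hypothetical.
without_pointer_guard() {
    env LD_POINTER_GUARD=0 "$@"
}

# Usage:
#   without_pointer_guard gdb ./emacs
```

Because GDB passes its own environment on to the program it starts, the
debugged Emacs sees the same setting.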
** Using GDB in Emacs

Debugging with GDB in Emacs offers some advantages over the command line (see
the GDB Graphical Interface node of the Emacs manual). There are also some
features available just for debugging Emacs:

1) The command gud-pp is available on the tool bar (the `pp' icon) and
   allows the user to print the s-expression of the variable at point,
   in the GUD buffer.

2) Pressing `p' on a component of a watch expression that is a Lisp object
   in the speedbar prints its s-expression in the GUD buffer.

3) The STOP button on the tool bar is adjusted so that it sends SIGTSTP
   instead of the usual SIGINT.

4) The command gud-pv has the global binding `C-x C-a C-v' and prints the
   value of the Lisp variable at point.

** Debugging what happens while preloading and dumping Emacs

Type `gdb temacs' and start it with `r -batch -l loadup dump'.

If temacs actually succeeds when running under GDB in this way, do not
try to run the dumped Emacs, because it was dumped with the GDB
breakpoints in it.

** Debugging `temacs'

Debugging `temacs' is useful when you want to establish whether a
problem happens in an undumped Emacs. To run `temacs' under a
debugger, type "gdb temacs", then start it with `r -batch -l loadup'.

** If you encounter X protocol errors

The X server normally reports protocol errors asynchronously,
so you find out about them long after the primitive which caused
the error has returned.

To get clear information about the cause of an error, try evaluating
(x-synchronize t). That puts Emacs into synchronous mode, where each
Xlib call checks for errors before it returns. This mode is much
slower, but when you get an error, you will see exactly which call
really caused the error.

You can start Emacs in a synchronous mode by invoking it with the -xrm
option, like this:

    emacs -xrm "emacs.synchronous: true"

Setting a breakpoint in the function `x_error_quitter' and looking at
the backtrace when Emacs stops inside that function will show what
code causes the X protocol errors.

Some bugs related to the X protocol disappear when Emacs runs in a
synchronous mode. To track down those bugs, we suggest the following
procedure:

  - Run Emacs under a debugger and put a breakpoint inside the
    primitive function which, when called from Lisp, triggers the X
    protocol errors. For example, if the errors happen when you
    delete a frame, put a breakpoint inside `Fdelete_frame'.

  - When the breakpoint breaks, step through the code, looking for
    calls to X functions (the ones whose names begin with "X" or
    "Xt" or "Xm").

  - Insert calls to `XSync' before and after each call to the X
    functions, like this:

       XSync (f->output_data.x->display_info->display, 0);

    where `f' is the pointer to the `struct frame' of the selected
    frame, normally available via XFRAME (selected_frame). (Most
    functions which call X already have some variable that holds the
    pointer to the frame, perhaps called `f' or `sf', so you shouldn't
    need to compute it.)

    If your debugger can call functions in the program being debugged,
    you should be able to issue the calls to `XSync' without recompiling
    Emacs. For example, with GDB, just type:

       call XSync (f->output_data.x->display_info->display, 0)

    before and immediately after the suspect X calls. If your
    debugger does not support this, you will need to add these pairs
    of calls in the source and rebuild Emacs.

    Either way, systematically step through the code and issue these
    calls until you find the first X function called by Emacs after
    which a call to `XSync' winds up in the function
    `x_error_quitter'. The first X function call for which this
    happens is the one that generated the X protocol error.

  - You should now look around this offending X call and try to figure
    out what is wrong with it.

** If Emacs causes errors or memory leaks in your X server

You can trace the traffic between Emacs and your X server with a tool
like xmon, available at ftp://ftp.x.org/contrib/devel_tools/.

Xmon can be used to see exactly what Emacs sends when X protocol errors
happen. If Emacs causes the X server memory usage to increase, you can
use xmon to see what items Emacs creates in the server (windows,
graphics contexts, pixmaps) and what items Emacs deletes. If there
are consistently more creations than deletions, the type of item
and the activity you do when the items get created can give a hint where
to start debugging.

** If the symptom of the bug is that Emacs fails to respond

Don't assume Emacs is `hung'--it may instead be in an infinite loop.
To find out which, make the problem happen under GDB and stop Emacs
once it is not responding. (If Emacs is using X Windows directly, you
can stop Emacs by typing C-z at the GDB job.) Then try stepping with
`step'. If Emacs is hung, the `step' command won't return. If it is
looping, `step' will return.

If this shows Emacs is hung in a system call, stop it again and
examine the arguments of the call. If you report the bug, it is very
important to state exactly where in the source the system call is, and
what the arguments are.

If Emacs is in an infinite loop, try to determine where the loop
starts and ends. The easiest way to do this is to use the GDB command
`finish'. Each time you use it, Emacs resumes execution until it
exits one stack frame. Keep typing `finish' until it doesn't
return--that means the infinite loop is in the stack frame which you
just tried to finish.

Stop Emacs again, and use `finish' repeatedly again until you get back
to that frame. Then use `next' to step through that frame. By
stepping, you will see where the loop starts and ends. Also, examine
the data being used in the loop and try to determine why the loop does
not exit when it should.

** If certain operations in Emacs are slower than they used to be, here
is some advice for how to find out why.

Stop Emacs repeatedly during the slow operation, and make a backtrace
each time. Compare the backtraces looking for a pattern--a specific
function that shows up more often than you'd expect.

If you don't see a pattern in the C backtraces, get some Lisp
backtrace information by typing "xbacktrace" or by looking at Ffuncall
frames (see above), and again look for a pattern.

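If the backtraces are saved to files, the comparison can be partly
automated. The sketch below counts how often each C function name
appears across several saved backtraces. It assumes each file holds
plain output of GDB's "bt" command in the usual
"#N [ADDR in] FUNCTION (ARGS) at FILE:LINE" layout; the helper name
`count_frames' and the file names are hypothetical.

```shell
# Count how often each function appears in saved GDB backtraces.
# Assumes one frame per line in the usual "bt" output layout.
# count_frames is a hypothetical helper name.
count_frames() {
    awk '/^#[0-9]+/ {
             # The function name is the field just before the "(ARGS" field.
             for (i = 2; i < NF; i++)
                 if (substr($(i + 1), 1, 1) == "(") { print $i; break }
         }' "$@" | sort | uniq -c | sort -rn
}

# Usage: save each backtrace to its own file (bt1.txt, bt2.txt, ...), then
#   count_frames bt*.txt | head
```

Functions that appear in every backtrace (such as `main') will always
top the list, so the interesting entries are the ones just below them.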
When using X, you can stop Emacs at any time by typing C-z at GDB.
When not using X, you can do this with C-g. On non-Unix platforms,
such as MS-DOS, you might need to press C-BREAK instead.

** If GDB does not run and your debuggers can't load Emacs.

On some systems, no debugger can load Emacs with a symbol table,
perhaps because they all have fixed limits on the number of symbols
and Emacs exceeds the limits. Here is a method that can be used
in such an extremity. Do

    nm -n temacs > nmout
    strip temacs
    adb temacs
    0xd:i
    0xe:i
    14:i
    17:i
    :r -l loadup   (or whatever)

It is necessary to refer to the file `nmout' to convert
numeric addresses into symbols and vice versa.

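Looking up an address in `nmout' by eye is tedious, so a small lookup
helper may be worth having. The sketch below relies on `nm -n' sorting
its output by address and prints the last symbol whose address does not
exceed the one given. It assumes three-column nm lines (address, type,
name) with zero-padded hexadecimal addresses; the helper name
`addr_to_symbol' is hypothetical.

```shell
# Map a numeric address back to the nearest preceding symbol in the
# output of "nm -n", which is sorted by address.
# Assumptions: the address is hexadecimal without a "0x" prefix and is
# no longer than the zero-padded addresses in the file.
# addr_to_symbol is a hypothetical helper name.
addr_to_symbol() {
    LC_ALL=C awk -v want="$2" '
        BEGIN { want = tolower(want) }
        NF == 3 {
            addr = tolower($1)
            # Left-pad the query so equal-length strings compare like numbers.
            while (length(want) < length(addr)) want = "0" want
            if ((addr "") <= (want "")) { best = $3 } else { exit }
        }
        END { if (best != "") print best }' "$1"
}

# Usage:
#   addr_to_symbol nmout 80a1c44
```

Going the other way, from symbol to address, needs no helper: a plain
grep for the symbol name in `nmout' is enough.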
It is useful to be running under a window system.
Then, if Emacs becomes hopelessly wedged, you can create
another window to do kill -9 in. kill -ILL is often
useful too, since that may make Emacs dump core or return
to adb.


** Debugging incorrect screen updating.

To debug problems where Emacs updates the screen incorrectly, it is
useful to have a record of what input you typed and what Emacs sent to
the screen. To make these records, do

    (open-dribble-file "~/.dribble")
    (open-termscript "~/.termscript")

The dribble file contains all characters read by Emacs from the
terminal, and the termscript file contains all characters it sent to
the terminal. The use of the directory `~/' prevents interference
with any other user.

If you have irreproducible display problems, put those two expressions
in your ~/.emacs file. When the problem happens, exit the Emacs that
you were running, kill it, and rename the two files. Then you can start
another Emacs without clobbering those files, and use it to examine them.

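The renaming step is easy to forget in the heat of the moment, so it
may help to keep it in a shell function. This is only a sketch: the two
file names are the ones used above, while the function name
`save_display_logs' and the timestamp suffix are illustrative choices.

```shell
# Rename the dribble and termscript files so that a fresh Emacs session
# does not overwrite them.  The function name and the timestamp suffix
# are illustrative choices, not an Emacs convention.
save_display_logs() {
    stamp=$(date +%Y%m%d-%H%M%S)
    for f in "$HOME/.dribble" "$HOME/.termscript"; do
        if [ -f "$f" ]; then
            mv "$f" "$f.$stamp"
        fi
    done
}

# Usage, after the misbehaving Emacs has exited or been killed:
#   save_display_logs
```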
An easy way to see if too much text is being redrawn on a terminal is to
evaluate `(setq inverse-video t)' before you try the operation you think
will cause too much redrawing. This doesn't refresh the screen, so only
newly drawn text is in inverse video.

The Emacs display code includes special debugging code, but it is
normally disabled. You can enable it by building Emacs with the
pre-processing symbol GLYPH_DEBUG defined. Here's one easy way,
suitable for Unix and GNU systems, to build such a debugging version:

    MYCPPFLAGS='-DGLYPH_DEBUG=1' make

Building Emacs like that activates many assertions which scrutinize
display code operation more than Emacs does normally. (To see the
code which tests these assertions, look for calls to the `xassert'
macros.) Any assertion that is reported to fail should be
investigated.

Building with GLYPH_DEBUG defined also defines several helper
functions which can help debugging display code. One such function is
`dump_glyph_matrix'. If you run Emacs under GDB, you can print the
contents of any glyph matrix by just calling that function with the
matrix as its argument. For example, the following command will print
the contents of the current matrix of the window whose pointer is in
`w':

    (gdb) p dump_glyph_matrix (w->current_matrix, 2)

(The second argument 2 tells dump_glyph_matrix to print the glyphs in
a long form.) You can dump the selected window's current glyph matrix
interactively with "M-x dump-glyph-matrix RET"; see the documentation
of this function for more details.

Several more functions for debugging display code are available in
Emacs compiled with GLYPH_DEBUG defined; type "C-h f dump- TAB" and
"C-h f trace- TAB" to see the full list.

When you debug display problems running Emacs under X, you can use
the `ff' command to flush all pending display updates to the screen.


** Debugging LessTif

If you encounter bugs whereby Emacs built with LessTif grabs all mouse
and keyboard events, or LessTif menus behave weirdly, it might be
helpful to set the `DEBUGSOURCES' and `DEBUG_FILE' environment
variables, so that one can see what LessTif was doing at this point.
For instance

    export DEBUGSOURCES="RowColumn.c:MenuShell.c:MenuUtil.c"
    export DEBUG_FILE=/usr/tmp/LESSTIF_TRACE
    emacs &

causes LessTif to print traces from the three named source files to a
file in `/usr/tmp' (that file can get pretty large). The above should
be typed at the shell prompt before invoking Emacs, as shown by the
last line above.

Running GDB from another terminal could also help with such problems.
You can arrange for GDB to run on one machine, with the Emacs display
appearing on another. Then, when the bug happens, you can go back to
the machine where you started GDB and use the debugger from there.


** Debugging problems which happen in GC

The array `last_marked' (defined in alloc.c) can be used to display up
to 500 last objects marked by the garbage collection process.
Whenever the garbage collector marks a Lisp object, it records the
pointer to that object in the `last_marked' array, which is maintained
as a circular buffer. The variable `last_marked_index' holds the
index into the `last_marked' array one place beyond where the pointer
to the very last marked object is stored.

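Because `last_marked_index' points one place beyond the newest entry
and the buffer wraps around, finding the slot of a given entry takes a
little modular arithmetic. The sketch below works it out for the
500-entry size mentioned above; the helper name `last_marked_slot' is
hypothetical, and both arguments are assumed to lie between 0 and 499.

```shell
# Index into last_marked of the K-th most recently marked object
# (K = 0 is the very last one), given the value of last_marked_index.
# Assumes the 500-entry size mentioned above; hypothetical helper name.
last_marked_slot() {
    echo $(( ($1 - 1 - $2 + 500) % 500 ))
}

last_marked_slot 0 0     # prints 499: the newest entry is in the last slot
```

In GDB you would first print `last_marked_index', feed its value to the
function, and then examine that element of `last_marked'.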
The single most important goal in debugging GC problems is to find the
Lisp data structure that got corrupted. This is not easy since GC
changes the tag bits and relocates strings, which makes it hard to look
at Lisp objects with commands such as `pr'. It is sometimes necessary
to convert Lisp_Object variables into pointers to C structs manually.

Use the `last_marked' array and the source to reconstruct the sequence
in which objects were marked. In general, you need to correlate the
values recorded in the `last_marked' array with the corresponding
stack frames in the backtrace, beginning with the innermost frame.
Some subroutines of `mark_object' are invoked recursively, others loop
over portions of the data structure and mark them as they go. By
looking at the code of those routines and comparing the frames in the
backtrace with the values in `last_marked', you will be able to find
connections between the values in `last_marked'. E.g., when GC finds
a cons cell, it recursively marks its car and its cdr. Similar things
happen with properties of symbols, elements of vectors, etc. Use
these connections to reconstruct the data structure that was being
marked, paying special attention to the strings and names of symbols
that you encounter: these strings and symbol names can be used to grep
the sources to find out what high-level symbols and global variables
are involved in the crash.

Once you discover the corrupted Lisp object or data structure, grep
the sources for its uses and try to figure out what could cause the
corruption. If looking at the sources doesn't help, you could try
setting a watchpoint on the corrupted data, and see what code modifies
it in some invalid way. (Obviously, this technique is only useful for
data that is modified only very rarely.)

It is also useful to look at the corrupted object or data structure in
a fresh Emacs session and compare its contents with a session that you
are debugging.

** Debugging problems with non-ASCII characters

If you experience problems which seem to be related to non-ASCII
characters, such as \201 characters appearing in the buffer or in your
files, set the variable byte-debug-flag to t. This causes Emacs to do
some extra checks, such as look for broken relations between byte and
character positions in buffers and strings; the resulting diagnostics
might pinpoint the cause of the problem.

| 588 |
** Debugging the TTY (non-windowed) version

The most convenient method of debugging the character-terminal display
is to do that on a window system such as X. Begin by starting an
xterm window, then type these commands inside that window:

    $ tty
    $ echo $TERM

Let's say these commands print "/dev/ttyp4" and "xterm", respectively.

Now start Emacs (the normal, windowed-display session, i.e. without
the `-nw' option), and invoke "M-x gdb RET emacs RET" from there. Now
type these commands at GDB's prompt:

    (gdb) set args -nw -t /dev/ttyp4
    (gdb) set environment TERM xterm
    (gdb) run

The debugged Emacs should now start in no-window mode with its display
directed to the xterm window you opened above.

A similar arrangement is possible on a character terminal by using the
`screen' package.

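To avoid retyping the terminal name and type, the GDB commands shown
above can be generated in the target xterm itself. This is a minimal
sketch; `emacs_tty_gdb_commands' is a hypothetical helper, not
something shipped with Emacs.

```shell
# Print the GDB commands for debugging a -nw Emacs whose display is
# another terminal.  $1 is the terminal device, $2 its TERM value.
emacs_tty_gdb_commands () {
  printf 'set args -nw -t %s\n' "$1"
  printf 'set environment TERM %s\n' "$2"
  printf 'run\n'
}

# Run this inside the xterm that should display the debugged Emacs,
# then paste the three printed lines at GDB's prompt.
emacs_tty_gdb_commands "$(tty)" "${TERM:-dumb}"
```

The fallback to "dumb" only guards against TERM being unset; in a real
xterm the value printed by "echo $TERM" is what gets used.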
** Running Emacs built with malloc debugging packages

If Emacs exhibits bugs that seem to be related to use of memory
allocated off the heap, it might be useful to link Emacs with a
special debugging library, such as Electric Fence (a.k.a. efence) or
GNU Checker, which helps find such problems.

Emacs compiled with such packages might not run without some hacking,
because Emacs replaces the system's memory allocation functions with
its own versions, and because the dumping process might be
incompatible with the way these packages track allocated memory. Here
are some of the changes you might find necessary (SYSTEM-NAME and
MACHINE-NAME are the names of your OS- and CPU-specific headers in the
subdirectories of `src'):

  - In src/s/SYSTEM-NAME.h add "#define SYSTEM_MALLOC".

  - In src/m/MACHINE-NAME.h add "#define CANNOT_DUMP" and
    "#define CANNOT_UNEXEC".

  - Configure with a different --prefix= option. If you use GCC,
    version 2.7.2 is preferred, as some malloc debugging packages
    work a lot better with it than with 2.95 or later versions.

  - Type "make" then "make -k install".

  - If required, invoke the package-specific command to prepare
    src/temacs for execution.

  - cd ..; src/temacs

(Note that this runs `temacs' instead of the usual `emacs' executable.
This avoids problems with dumping Emacs mentioned above.)

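The command-line part of the steps above can be collected in a small
script. This is a sketch under assumptions: it is run from the top of
the source tree after the header edits have been made by hand, the
PREFIX and RUN variables and the default prefix directory are
hypothetical, and the package-specific preparation step is omitted
because it depends on the library used.

```shell
# Dry run by default: each command is only printed.  Set RUN=1 in the
# environment to execute the commands as well.
run () {
  echo "+ $*"
  if [ "${RUN:-0}" = 1 ]; then
    "$@"
  fi
}

# Hypothetical install prefix, kept apart from any regular Emacs.
prefix=${PREFIX:-$HOME/emacs-malloc-debug}

run ./configure --prefix="$prefix"
run make
run make -k install
# Start the undumped binary, as explained in the note above.
run src/temacs
```

Keeping the default a dry run makes it easy to review the commands
before committing to a long build.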
Some malloc debugging libraries might print lots of false alarms for
bitfields used by Emacs in some data structures. If you want to get
rid of the false alarms, you will have to hack the definitions of
these data structures in the respective headers to remove the `:N'
bitfield definitions (which will cause each such field to use a full
int).

** How to recover buffer contents from an Emacs core dump file

The file etc/emacs-buffer.gdb defines a set of GDB commands for
recovering the contents of Emacs buffers from a core dump file. You
might also find those commands useful for displaying the list of
buffers in human-readable format from within the debugger.

** Some suggestions for debugging on MS Windows:

   (written by Marc Fleischeuers, Geoff Voelker and Andrew Innes)

To debug Emacs with Microsoft Visual C++, you can either start Emacs
from the debugger or attach the debugger to a running Emacs process.

To start Emacs from the debugger, you can use the file bin/debug.bat.
Microsoft Developer Studio will start, and under Project, Settings,
Debug, General you can set the command-line arguments and Emacs's
startup directory. Set breakpoints (Edit, Breakpoints) at Fsignal and
other functions that you want to examine. Run the program (Build,
Start debug). Emacs will start and the debugger will take control as
soon as a breakpoint is hit.

You can also attach the debugger to an already running Emacs process.
To do this, start up Microsoft Developer Studio and select Build,
Start debug, Attach to process. Choose the Emacs process from the
list. Send a break to the running process (Debug, Break) and you will
find that execution is halted somewhere in user32.dll. Open the stack
trace window and go up the stack to w32_msg_pump. Now you can set
breakpoints in Emacs (Edit, Breakpoints). Continue the running Emacs
process (Debug, Step out) and control will return to Emacs, until a
breakpoint is hit.

To examine the contents of a Lisp variable, you can use the function
'debug_print'. Right-click on a variable, select QuickWatch (it has
an eyeglass symbol on its button in the toolbar), and in the text
field at the top of the window, place 'debug_print(' and ')' around
the expression. Press 'Recalculate' and the output is sent to stderr,
and to the debugger via the OutputDebugString routine. The output
sent to stderr should be displayed in the console window that was
opened when the emacs.exe executable was started. The output sent to
the debugger should be displayed in the 'Debug' pane in the Output
window. If Emacs was started from the debugger, a console window was
opened at Emacs' startup; this console window also shows the output of
'debug_print'.

For example, start and run Emacs in the debugger until it is waiting
for user input. Then click on the `Break' button in the debugger to
halt execution. Emacs should halt in `ZwUserGetMessage' waiting for
an input event. Use the `Call Stack' window to select the procedure
`w32_msg_pump' up the call stack (see below for why you have to do
this). Open the QuickWatch window and enter
"debug_print(Vexec_path)". Evaluating this expression will then print
out the contents of the Lisp variable `exec-path'.

If QuickWatch reports that the symbol is unknown, then check the call
stack in the `Call Stack' window. If the selected frame in the call
stack is not an Emacs procedure, then the debugger won't recognize
Emacs symbols. Instead, select a frame that is inside an Emacs
procedure and try using `debug_print' again.

If QuickWatch invokes debug_print but nothing happens, then check the
thread that is selected in the debugger. If the selected thread is
not the last thread to run (the "current" thread), then it cannot be
used to execute debug_print. Use the Debug menu to select the current
thread and try using debug_print again. Note that the debugger halts
execution (e.g., due to a breakpoint) in the context of the current
thread, so this should only be a problem if you've explicitly switched
threads.

It is also possible to keep appropriately masked and typecast Lisp
symbols in the Watch window; this is more convenient when stepping
through the code. For instance, on entering apply_lambda, you can
watch (struct Lisp_Symbol *) (0xfffffff & args[0]).

Optimizations often confuse the MS debugger. For example, the
debugger will sometimes report wrong line numbers, e.g., when it
prints the backtrace for a crash. It is usually best to look at the
disassembly to determine exactly what code is being run--the
disassembly will probably show several source lines followed by a
block of assembler for those lines. The actual point where Emacs
crashes will be one of those source lines, but not necessarily the one
that the debugger reports.

Another problematic area with the MS debugger is with variables that
are stored in registers: it will sometimes display wrong values for
those variables. Usually you will not be able to see any value for a
register variable, but if it is only being stored in a register
temporarily, you will see an old value for it. Again, you need to
look at the disassembly to determine which registers are being used,
and look at those registers directly, to see the actual current values
of these variables.


This file is part of GNU Emacs.

GNU Emacs is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.

GNU Emacs is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GNU Emacs; see the file COPYING. If not, write to the
Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
Boston, MA 02110-1301, USA.


Local variables:
mode: outline
paragraph-separate: "[ ]*$"
end:

;;; arch-tag: fbf32980-e35d-481f-8e4c-a2eca2586e6b