| 247 | | Since characters are really integers, the printed representation of a |
|---|
| 248 | | character is a decimal number. This is also a possible read syntax for |
|---|
| 249 | | a character, but writing characters that way in Lisp programs is a very |
|---|
| 250 | | bad idea. You should @emph{always} use the special read syntax formats |
|---|
| 251 | | that Emacs Lisp provides for characters. These syntax formats start |
|---|
| 252 | | with a question mark. |
|---|
| | 261 | |
|---|
| | 262 | Since characters are really integers, the printed representation of |
|---|
| | 263 | a character is a decimal number. This is also a possible read syntax |
|---|
| | 264 | for a character, but writing characters that way in Lisp programs is |
|---|
| | 265 | not clear programming. You should @emph{always} use the special read |
|---|
| | 266 | syntax formats that Emacs Lisp provides for characters. These syntax |
|---|
| | 267 | formats start with a question mark. |
|---|
| | 333 | A backslash is allowed, and harmless, preceding any character without |
|---|
| | 334 | a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}. |
|---|
| | 335 | There is no reason to add a backslash before most characters. However, |
|---|
| | 336 | you should add a backslash before any of the characters |
|---|
| | 337 | @samp{()\|;'`"#.,} to avoid confusing the Emacs commands for editing |
|---|
| | 338 | Lisp code. You can also add a backslash before whitespace characters such as |
|---|
| | 339 | space, tab, newline and formfeed. However, it is cleaner to use one of |
|---|
| | 340 | the easily readable escape sequences, such as @samp{\t} or @samp{\s}, |
|---|
| | 341 | instead of an actual whitespace character such as a tab or a space. |
|---|
| | 342 | (If you do write backslash followed by a space, you should write |
|---|
| | 343 | an extra space after the character constant to separate it from the |
|---|
| | 344 | following text.) |
|---|
| | 345 | |
|---|
| | 346 | @node General Escape Syntax |
|---|
| | 347 | @subsubsection General Escape Syntax |
|---|
| | 348 | |
|---|
| | 349 | In addition to the specific escape sequences for special important |
|---|
| | 350 | control characters, Emacs provides general categories of escape syntax |
|---|
| | 351 | that you can use to specify non-ASCII text characters. |
|---|
| | 352 | |
|---|
| | 353 | @cindex unicode character escape |
|---|
| | 354 | For instance, you can specify characters by their Unicode values. |
|---|
| | 355 | @code{?\u@var{nnnn}} represents a character that maps to the Unicode |
|---|
| | 356 | code point @samp{U+@var{nnnn}}. There is a slightly different syntax |
|---|
| | 357 | for specifying characters with code points above @code{#xFFFF}; |
|---|
| | 358 | @code{\U00@var{nnnnnn}} represents the character whose Unicode code |
|---|
| | 359 | point is @samp{U+@var{nnnnnn}}, if such a character is supported by |
|---|
| | 360 | Emacs. If the corresponding character is not supported, Emacs signals |
|---|
| | 361 | an error. |
|---|
| | 362 | |
|---|
| | 363 | This peculiar and inconvenient syntax was adopted for compatibility |
|---|
| | 364 | with other programming languages. Unlike some other languages, Emacs |
|---|
| | 365 | Lisp supports this syntax only in character literals and strings. |
|---|
| | 366 | |
|---|
| | 367 | @cindex @samp{\} in character constant |
|---|
| | 368 | @cindex backslash in character constant |
|---|
| | 369 | @cindex octal character code |
|---|
| | 370 | The most general read syntax for a character represents the |
|---|
| | 371 | character code in either octal or hex. To use octal, write a question |
|---|
| | 372 | mark followed by a backslash and the octal character code (up to three |
|---|
| | 373 | octal digits); thus, @samp{?\101} for the character @kbd{A}, |
|---|
| | 374 | @samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the |
|---|
| | 375 | character @kbd{C-b}. Although this syntax can represent any |
|---|
| | 376 | @acronym{ASCII} character, it is preferred only when the precise octal |
|---|
| | 377 | value is more important than the @acronym{ASCII} representation. |
|---|
| | 378 | |
|---|
| | 379 | @example |
|---|
| | 380 | @group |
|---|
| | 381 | ?\012 @result{} 10 ?\n @result{} 10 ?\C-j @result{} 10 |
|---|
| | 382 | ?\101 @result{} 65 ?A @result{} 65 |
|---|
| | 383 | @end group |
|---|
| | 384 | @end example |
|---|
| | 385 | |
|---|
| | 386 | To use hex, write a question mark followed by a backslash, @samp{x}, |
|---|
| | 387 | and the hexadecimal character code. You can use any number of hex |
|---|
| | 388 | digits, so you can represent any character code in this way. |
|---|
| | 389 | Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the |
|---|
| | 390 | character @kbd{C-a}, and @code{?\x8e0} for the Latin-1 character |
|---|
| | 391 | @iftex |
|---|
| | 392 | @samp{@`a}. |
|---|
| | 393 | @end iftex |
|---|
| | 394 | @ifnottex |
|---|
| | 395 | @samp{a} with grave accent. |
|---|
| | 396 | @end ifnottex |
|---|
| | 397 | |
|---|
| | 398 | @node Ctl-Char Syntax |
|---|
| | 399 | @subsubsection Control-Character Syntax |
|---|
| | 400 | |
|---|
| 434 | | @cindex unicode character escape |
|---|
| 435 | | Emacs provides a syntax for specifying characters by their Unicode |
|---|
| 436 | | code points. @code{?\u@var{nnnn}} represents a character that maps to |
|---|
| 437 | | the Unicode code point @samp{U+@var{nnnn}}. There is a slightly |
|---|
| 438 | | different syntax for specifying characters with code points above |
|---|
| 439 | | @code{#xFFFF}; @code{\U00@var{nnnnnn}} represents the character whose |
|---|
| 440 | | Unicode code point is @samp{U+@var{nnnnnn}}, if such a character |
|---|
| 441 | | is supported by Emacs. If the corresponding character is not |
|---|
| 442 | | supported, Emacs signals an error. |
|---|
| 443 | | |
|---|
| 444 | | This peculiar and inconvenient syntax was adopted for compatibility |
|---|
| 445 | | with other programming languages. Unlike some other languages, Emacs |
|---|
| 446 | | Lisp supports this syntax in only character literals and strings. |
|---|
| 447 | | |
|---|
| 448 | | @cindex @samp{\} in character constant |
|---|
| 449 | | @cindex backslash in character constant |
|---|
| 450 | | @cindex octal character code |
|---|
| 451 | | Finally, the most general read syntax for a character represents the |
|---|
| 452 | | character code in either octal or hex. To use octal, write a question |
|---|
| 453 | | mark followed by a backslash and the octal character code (up to three |
|---|
| 454 | | octal digits); thus, @samp{?\101} for the character @kbd{A}, |
|---|
| 455 | | @samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the |
|---|
| 456 | | character @kbd{C-b}. Although this syntax can represent any @acronym{ASCII} |
|---|
| 457 | | character, it is preferred only when the precise octal value is more |
|---|
| 458 | | important than the @acronym{ASCII} representation. |
|---|
| 459 | | |
|---|
| 460 | | @example |
|---|
| 461 | | @group |
|---|
| 462 | | ?\012 @result{} 10 ?\n @result{} 10 ?\C-j @result{} 10 |
|---|
| 463 | | ?\101 @result{} 65 ?A @result{} 65 |
|---|
| 464 | | @end group |
|---|
| 465 | | @end example |
|---|
| 466 | | |
|---|
| 467 | | To use hex, write a question mark followed by a backslash, @samp{x}, |
|---|
| 468 | | and the hexadecimal character code. You can use any number of hex |
|---|
| 469 | | digits, so you can represent any character code in this way. |
|---|
| 470 | | Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the |
|---|
| 471 | | character @kbd{C-a}, and @code{?\x8e0} for the Latin-1 character |
|---|
| 472 | | @iftex |
|---|
| 473 | | @samp{@`a}. |
|---|
| 474 | | @end iftex |
|---|
| 475 | | @ifnottex |
|---|
| 476 | | @samp{a} with grave accent. |
|---|
| 477 | | @end ifnottex |
|---|
| 478 | | |
|---|
| 479 | | A backslash is allowed, and harmless, preceding any character without |
|---|
| 480 | | a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}. |
|---|
| 481 | | There is no reason to add a backslash before most characters. However, |
|---|
| 482 | | you should add a backslash before any of the characters |
|---|
| 483 | | @samp{()\|;'`"#.,} to avoid confusing the Emacs commands for editing |
|---|
| 484 | | Lisp code. You can also add a backslash before whitespace characters such as |
|---|
| 485 | | space, tab, newline and formfeed. However, it is cleaner to use one of |
|---|
| 486 | | the easily readable escape sequences, such as @samp{\t} or @samp{\s}, |
|---|
| 487 | | instead of an actual whitespace character such as a tab or a space. |
|---|
| 488 | | (If you do write backslash followed by a space, you should write |
|---|
| 489 | | an extra space after the character constant to separate it from the |
|---|
| 490 | | following text.) |
|---|
| 491 | | |
|---|
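
The read syntaxes described in the moved text can be checked side by side. The following is a minimal sketch, assuming a Unicode-based Emacs (version 23 or later) where character codes equal Unicode code points; it is not part of the patch itself, and each form can be evaluated on its own with `M-:` or in `*scratch*`.

```elisp
;; Each comparison below is expected to evaluate to t.
(= ?A 65)            ; a character is just an integer
(= ?\101 ?A)         ; octal escape: up to three octal digits
(= ?\x41 ?A)         ; hex escape: any number of hex digits
(= ?\u0041 ?A)       ; Unicode escape with exactly four hex digits
(= ?\U00000041 ?A)   ; Unicode escape with eight hex digits
(= ?\012 ?\n)        ; octal 012 is newline, code 10
(= ?\C-j ?\n)        ; so is the control-character syntax
(= ?\+ ?+)           ; a backslash before an ordinary character is harmless
(= ?\s 32)           ; readable escape for the space character
```

The `\u` and `\U` forms take a fixed number of digits, whereas `\x` reads as many hex digits as follow it, which is why the text calls the hex form the most general one.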