Timestamp:
09/18/06 20:48:14
Author:
miyoshi
Message:

Sync up with Emacs CVS HEAD.

Files:

  • trunk/lispref/objects.texi

    r4131 r4166  
    228228example, the character @kbd{A} is represented as the @w{integer 65}. 
    229229 
    230   Individual characters are not often used in programs.  It is far more 
    231 common to work with @emph{strings}, which are sequences composed of 
    232 characters.  @xref{String Type}. 
     230  Individual characters are used occasionally in programs, but it is 
     231more common to work with @emph{strings}, which are sequences composed 
     232of characters.  @xref{String Type}. 
    233233 
    234234  Characters in strings, buffers, and files are currently limited to 
     
    240240Control, Meta and Shift. 
    241241 
     242  There are special functions for producing a human-readable textual 
     243description of a character for the sake of messages.  @xref{Describing 
     244Characters}. 
     245 
     246@menu 
     247* Basic Char Syntax:: 
     248* General Escape Syntax:: 
     249* Ctl-Char Syntax:: 
     250* Meta-Char Syntax:: 
     251* Other Char Bits:: 
     252@end menu 
     253 
     254@node Basic Char Syntax 
     255@subsubsection Basic Char Syntax 
    242256@cindex read syntax for characters 
    243257@cindex printed representation for characters 
     
    245259@cindex @samp{?} in character constant 
    246260@cindex question mark in character constant 
    247   Since characters are really integers, the printed representation of a 
    248 character is a decimal number.  This is also a possible read syntax for 
    249 a character, but writing characters that way in Lisp programs is a very 
    250 bad idea.  You should @emph{always} use the special read syntax formats 
    251 that Emacs Lisp provides for characters.  These syntax formats start 
    252 with a question mark. 
     261 
     262  Since characters are really integers, the printed representation of 
     263a character is a decimal number.  This is also a possible read syntax 
     264for a character, but writing characters that way in Lisp programs is 
     265not clear programming.  You should @emph{always} use the special read 
     266syntax formats that Emacs Lisp provides for characters.  These syntax 
     267formats start with a question mark. 
    253268 
    254269  The usual read syntax for alphanumeric characters is a question mark 
     
    316331constants; in string constants, just write the space. 
    317332 
     333  A backslash is allowed, and harmless, preceding any character without 
     334a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}. 
     335There is no reason to add a backslash before most characters.  However, 
     336you should add a backslash before any of the characters 
     337@samp{()\|;'`"#.,} to avoid confusing the Emacs commands for editing 
     338Lisp code.  You can also add a backslash before whitespace characters such as 
     339space, tab, newline and formfeed.  However, it is cleaner to use one of 
     340the easily readable escape sequences, such as @samp{\t} or @samp{\s}, 
     341instead of an actual whitespace character such as a tab or a space. 
     342(If you do write backslash followed by a space, you should write 
     343an extra space after the character constant to separate it from the 
     344following text.) 
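
The backslash rules in the added paragraph above can be checked directly in Emacs Lisp. The following is a minimal sketch; the function name `my-char-backslash-demo` is hypothetical and exists only for this illustration.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-char-backslash-demo ()
  "Return non-nil if each escaped form reads as the expected character code."
  (and (= ?\+ ?+)      ; a harmless backslash: same character as ?+
       (= ?\( 40)      ; backslash before a paren keeps editing commands happy
       (= ?\s 32)      ; readable escape for space
       (= ?\t 9)))     ; readable escape for tab
```

Evaluating `(my-char-backslash-demo)` should return `t`.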
     345 
     346@node General Escape Syntax 
     347@subsubsection General Escape Syntax 
     348 
     349  In addition to the specific escape sequences for special important 
     350control characters, Emacs provides general categories of escape syntax 
     351that you can use to specify non-ASCII text characters. 
     352 
     353@cindex unicode character escape 
     354  For instance, you can specify characters by their Unicode values. 
     355@code{?\u@var{nnnn}} represents a character that maps to the Unicode 
     356code point @samp{U+@var{nnnn}}.  There is a slightly different syntax 
     357for specifying characters with code points above @code{#xFFFF}; 
     358@code{\U00@var{nnnnnn}} represents the character whose Unicode code 
     359point is @samp{U+@var{nnnnnn}}, if such a character is supported by 
     360Emacs.  If the corresponding character is not supported, Emacs signals 
     361an error. 
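
As a small illustration of the two Unicode escape forms described above, the sketch below uses the code point U+0041, chosen because an ASCII code point has the same character code in every Emacs version. The function name `my-unicode-escape-demo` is hypothetical.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-unicode-escape-demo ()
  "Return the codes read from the four-digit and eight-digit Unicode escapes."
  (list ?\u0041          ; \u takes exactly four hex digits
        ?\U00000041))    ; \U takes exactly eight hex digits
```

Both elements should equal `?A`, and the same escape works inside a string, so `(string= "\u0041" "A")` should also hold.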
     362 
     363  This peculiar and inconvenient syntax was adopted for compatibility 
     364with other programming languages.  Unlike some other languages, Emacs 
     365Lisp supports this syntax in only character literals and strings. 
     366 
     367@cindex @samp{\} in character constant 
     368@cindex backslash in character constant 
     369@cindex octal character code 
     370  The most general read syntax for a character represents the 
     371character code in either octal or hex.  To use octal, write a question 
     372mark followed by a backslash and the octal character code (up to three 
     373octal digits); thus, @samp{?\101} for the character @kbd{A}, 
     374@samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the 
     375character @kbd{C-b}.  Although this syntax can represent any 
     376@acronym{ASCII} character, it is preferred only when the precise octal 
     377value is more important than the @acronym{ASCII} representation. 
     378 
     379@example 
     380@group 
     381?\012 @result{} 10         ?\n @result{} 10         ?\C-j @result{} 10 
     382?\101 @result{} 65         ?A @result{} 65 
     383@end group 
     384@end example 
     385 
     386  To use hex, write a question mark followed by a backslash, @samp{x}, 
     387and the hexadecimal character code.  You can use any number of hex 
     388digits, so you can represent any character code in this way. 
     389Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the 
     390character @kbd{C-a}, and @code{?\x8e0} for the Latin-1 character 
     391@iftex 
     392@samp{@`a}. 
     393@end iftex 
     394@ifnottex 
     395@samp{a} with grave accent. 
     396@end ifnottex 
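
To tie the octal and hex forms together, here is a minimal sketch that compares them with the ordinary read syntax for the same characters. It uses only the examples given in the text above; the function name `my-char-code-forms-agree-p` is hypothetical.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-char-code-forms-agree-p ()
  "Return non-nil if octal, hex and ordinary read syntax yield the same codes."
  (and (= ?\101 ?\x41)   ; octal 101 and hex 41 ...
       (= ?\x41 ?A)      ; ... are both the character A,
       (= ?A 65)         ; whose code is 65
       (= ?\001 ?\x1)    ; octal 001 and hex 1 ...
       (= ?\x1 ?\C-a)    ; ... are both C-a
       (= ?\012 ?\n)))   ; octal 012 is newline
```

Evaluating `(my-char-code-forms-agree-p)` should return `t`.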
     397 
     398@node Ctl-Char Syntax 
     399@subsubsection Control-Character Syntax 
     400 
    318401@cindex control characters 
    319   Control characters may be represented using yet another read syntax. 
     402  Control characters can be represented using yet another read syntax. 
    320403This consists of a question mark followed by a backslash, caret, and the 
    321404corresponding non-control character, in either upper or lower case.  For 
     
    364447people who read it. 
    365448 
     449@node Meta-Char Syntax 
     450@subsubsection Meta-Character Syntax 
     451 
    366452@cindex meta characters 
    367453  A @dfn{meta character} is a character typed with the @key{META} 
     
    395481or as @samp{?\M-\101}.  Likewise, you can write @kbd{C-M-b} as 
    396482@samp{?\M-\C-b}, @samp{?\C-\M-b}, or @samp{?\M-\002}. 
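
The equivalences stated above for meta and control-meta characters can be verified with a short sketch. The function name `my-meta-forms-agree-p` is hypothetical, and the check relies only on the forms named in the text.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-meta-forms-agree-p ()
  "Return non-nil if the alternative spellings read as the same character."
  (and (= ?\M-A ?\M-\101)        ; M-A via the octal code of A
       (= ?\M-\C-b ?\C-\M-b)     ; modifier order does not matter
       (= ?\M-\C-b ?\M-\002)))   ; C-b written as its octal code
```

Evaluating `(my-meta-forms-agree-p)` should return `t`.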
     483 
     484@node Other Char Bits 
     485@subsubsection Other Character Modifier Bits 
    397486 
    398487  The case of a graphic character is indicated by its character code; 
     
    432521@end ifnottex 
    433522 
    434 @cindex unicode character escape 
    435   Emacs provides a syntax for specifying characters by their Unicode 
    436 code points.  @code{?\u@var{nnnn}} represents a character that maps to 
    437 the Unicode code point @samp{U+@var{nnnn}}.  There is a slightly 
    438 different syntax for specifying characters with code points above 
    439 @code{#xFFFF}; @code{\U00@var{nnnnnn}} represents the character whose 
    440 Unicode code point is @samp{U+@var{nnnnnn}}, if such a character 
    441 is supported by Emacs.  If the corresponding character is not 
    442 supported, Emacs signals an error. 
    443  
    444   This peculiar and inconvenient syntax was adopted for compatibility 
    445 with other programming languages.  Unlike some other languages, Emacs 
    446 Lisp supports this syntax in only character literals and strings. 
    447  
    448 @cindex @samp{\} in character constant 
    449 @cindex backslash in character constant 
    450 @cindex octal character code 
    451   Finally, the most general read syntax for a character represents the 
    452 character code in either octal or hex.  To use octal, write a question 
    453 mark followed by a backslash and the octal character code (up to three 
    454 octal digits); thus, @samp{?\101} for the character @kbd{A}, 
    455 @samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the 
    456 character @kbd{C-b}.  Although this syntax can represent any @acronym{ASCII} 
    457 character, it is preferred only when the precise octal value is more 
    458 important than the @acronym{ASCII} representation. 
    459  
    460 @example 
    461 @group 
    462 ?\012 @result{} 10         ?\n @result{} 10         ?\C-j @result{} 10 
    463 ?\101 @result{} 65         ?A @result{} 65 
    464 @end group 
    465 @end example 
    466  
    467   To use hex, write a question mark followed by a backslash, @samp{x}, 
    468 and the hexadecimal character code.  You can use any number of hex 
    469 digits, so you can represent any character code in this way. 
    470 Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the 
    471 character @kbd{C-a}, and @code{?\x8e0} for the Latin-1 character 
    472 @iftex 
    473 @samp{@`a}. 
    474 @end iftex 
    475 @ifnottex 
    476 @samp{a} with grave accent. 
    477 @end ifnottex 
    478  
    479   A backslash is allowed, and harmless, preceding any character without 
    480 a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}. 
    481 There is no reason to add a backslash before most characters.  However, 
    482 you should add a backslash before any of the characters 
    483 @samp{()\|;'`"#.,} to avoid confusing the Emacs commands for editing 
    484 Lisp code.  You can also add a backslash before whitespace characters such as 
    485 space, tab, newline and formfeed.  However, it is cleaner to use one of 
    486 the easily readable escape sequences, such as @samp{\t} or @samp{\s}, 
    487 instead of an actual whitespace character such as a tab or a space. 
    488 (If you do write backslash followed by a space, you should write 
    489 an extra space after the character constant to separate it from the 
    490 following text.) 
    491  
    492523@node Symbol Type 
    493524@subsection Symbol Type