INTERACTIVE UNIX System V/386 R3.2 V3.0

Contents

1. 4.1 Defining Character Classification
4.2 When to Use the Character Classification locale Category
4.3 Creating a Character Classification Category Definition
4.3.1 An Example of a Character Classification Definition
4.3.2 How a Program Uses This Information
4.3.3 Use in Regular Expressions and Shell Pattern Matching
5. PREPARING AND INSTALLING A COLLATION SEQUENCE
When to Use a Collation Sequence
Defining Collation Capabilities
5.4 Creating a Collation Sequence Definition
5.4.1 charmap Files
5.5 Source File Organisation
5.5.1 collating element Keyword
5.5.2 collating symbol Keyword
5.5.3 substitute Keyword
5.5.4 order_start Keyword
5.5.5 order_end Keyword
5.5.6 An Example
5.5.7 Use in Regular Expressions and Shell Pattern Matching
6. SPECIFYING NUMERIC AND MONETARY INFORMATION
6.1 Reasons for Defining Numeric and Monetary Formatting
6.2 Defining Numeric and Monetary Formatting
6.3 When to Use the Numeric and Monetary locale Category
6.4 Numeric Editing
6.5 Creating a Numeric Category Definition
6.5.1 decimal_point Keyword
6.5.2 thousands_sep Keyword
6.5.3 grouping Keyword
6.5.4 An Example of a Numeric Category Definition
6.5.5 How a Program Uses This Information
6.6 Monetary Editing
6.7 Creating a Monetary Category Definition
2. printf n to display all letters and symbols that you can use on the console, and the number by which each is represented inside the computer.

International Supplement User's Manual 21

If you are not familiar with the C language, follow these instructions to compile and run this program:

   1. Use an editor to create a file with a name that ends in .c (for example, show.c) and insert the exact text of the program.
   2. Compile the program. For example, to create show from show.c, type make show.
   3. Then, to run the program, type show.

Historically, the eighth bit of the byte that is used to store characters was used by the UNIX Operating System and its utilities for a variety of purposes. It could be used in a sorting algorithm to mark a character as already processed or, when a program allocated bytes of memory, to indicate that the byte was already in use. In communication software that works across telephone lines, which are not 100 percent reliable, the eighth bit was used for additional checking: the software set it so that every byte sent across the wire carried either an even or an odd number of 1 bits. This bit was then called a parity bit.

Most utilities provided with the UNIX Operating System were careless enough to ignore the value of this last bit, which prevented the use of characters with the eighth bit set, such as the ones displayed when running the program listed above (usually referred to as 8-bit characters). Utilitie
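The role of the eighth bit described above can be sketched in a few lines of C. This is only an illustration, not code from the manual: the function names has_eighth_bit and with_even_parity are hypothetical, and even parity is chosen arbitrarily (odd parity works the same way with the condition inverted).

```c
/* Returns 1 if the eighth (most significant) bit of the byte is set,
 * that is, if the byte holds what the text calls an 8-bit character. */
int has_eighth_bit(unsigned char byte)
{
    return (byte & 0x80) != 0;
}

/* Treats the eighth bit as a parity bit: keeps the low seven bits and
 * sets the eighth bit only when that makes the number of 1 bits even. */
unsigned char with_even_parity(unsigned char byte)
{
    unsigned char low7 = byte & 0x7F;
    unsigned char bits = low7;
    unsigned char ones = 0;

    while (bits != 0) {
        ones += bits & 1;   /* count the 1 bits in the low seven bits */
        bits >>= 1;
    }
    return (ones % 2 == 0) ? low7 : (unsigned char)(low7 | 0x80);
}
```

A utility that applies with_even_parity to its data destroys any 8-bit character it touches, which is the incompatibility the paragraph above describes.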
3. • There is a symbol for almost every code in the second half of this codeset.
• The symbols consist of accented letters (both uppercase and lowercase), special symbols, and graphics characters to draw lines and boxes.
• For some lowercase accented characters there are no uppercase equivalents.

Many personal computer programmers and applications use the graphics characters to draw straight lines, draw boxes around text, and so on. This codeset clearly supports most characters used in the major Western European languages, such as French and German.

In recent years, alternate codesets were developed for personal computers, and software was developed to change the codeset used by them when running DOS. Software to support this was developed for the INTERACTIVE UNIX Operating System as well. In the DOS world the name "codepage" was used, and the popular IBM extended ASCII codeset is now called IBM codepage 437. The introduction of additional codesets supports more languages spoken in a particular territory. A list of some of the existing IBM codepages and the targeted area or language includes:

Territory or Language (the only codepage number that survives in this excerpt is 437):
• 437: U.S. English and Western Europe
• International codepage; supports more letters and fewer graphics characters than codepage 437
• Canada
• Norway, Denmark
• Supports Russian alphabet

This list is incomplete; there are codepages
4. This was a step in the right direction, but it was incomplete because it described only the interfaces to the operating system.

In 1985, the X/Open Company Limited published the X/Open Portability Guide (XPG). It basically listed the SVID as its first chapters, but also included a description of the C language, the COBOL language, how to interface with databases, and other information. It is important to note that the X/Open Company always adopted standards where they existed, as opposed to creating new ones. Where standards were missing (for example, for internationalisation), they recommended standards.

3.3 Common Applications Environment

Now, more than five years later, the third issue of the X/Open Portability Guide (XPG3) is accepted by most governments and major corporations as the bible of the computer industry. Published in 1989, it consists of seven volumes describing the Common Applications Environment (CAE) defined by the X/Open Company and built on top of the interfaces of the UNIX Operating System, covering other aspects required for a comprehensive applications interface. The portion that discusses the operating system and its utilities is referred to as the X/Open System Interface (XSI). The seven volumes are:

• XSI Commands and Utilities
• XSI System Interfaces and Headers
• XSI Supplementary Definitions
• Programming Languages
• Data Management
• Window Mana
5. strerror(3P)

NAME
    strerror - error message strings

SYNOPSIS
    #include <string.h>
    char *strerror(errnum)
    int errnum;

DESCRIPTION
    The strerror function maps the error number in errnum to a language-dependent error message string and returns a pointer to it. The string pointed to will not be modified by the program, but may be overwritten by a subsequent call to the strerror function.

    In this implementation, strerror obtains the error message strings from a message catalogue named libc.cat. If such a message catalogue is not found in NLSPATH (see environ(5P)), then the system default catalogue /lib/locale/ISC/msgcat/libc.cat, which contains the English version of the error messages, will be used.

RETURN VALUE
    Upon successful completion, strerror returns a pointer to the generated message string. No return value is reserved to indicate an error.

ERRORS
    The strerror function may fail if:
    EINVAL    The value of errnum is not a valid error message number.

SEE ALSO
    perror(3P), environ(5P) in the INTERACTIVE SDS Guide and Programmer's Reference Manual.

NOTE TO USERS
    This entry is reprinted from the INTERACTIVE SDS Guide and Programmer's Reference Manual.

INTERACTIVE UNIX System, International Supplement

strxfrm(3P)

NAME
    strxfrm - string transformation

SYNOPSIS
    #include <string.h>
    size_t strxfrm(s1, s2, n)
    char *s1, *s2;
    size_t n;

DESCRIPTION
    The strxfrm
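Because the string returned by strerror may be overwritten by a later call, a caller that needs to keep the message should copy it first. The following is a minimal sketch of that pattern, written in ANSI C for a present-day C library rather than the K&R style of the synopsis above; the helper name copy_error_message is hypothetical, and snprintf is assumed to be available.

```c
#include <errno.h>   /* error number macros such as ENOENT */
#include <stdio.h>
#include <string.h>

/* Copies the message for errnum into the caller's buffer so that it
 * survives later calls to strerror. Returns buf. */
char *copy_error_message(int errnum, char *buf, size_t size)
{
    const char *message = strerror(errnum);

    if (size == 0)
        return buf;
    /* Truncates silently if the buffer is too small. */
    snprintf(buf, size, "%s", message != NULL ? message : "unknown error");
    return buf;
}
```

The language of the copied text follows whatever message catalogue the implementation selects; the helper itself does not depend on it.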
6. gencat(4P)

NAME
    gencat - format of message text source file used as input to gencat(1P)

DESCRIPTION
    This entry supplies the format of a message text source file as defined by the X/Open Portability Guide, Volume 3, XSI Supplementary Definitions, Section 5.2.1, "Message Text Source Files." The following symbolic constant values are found in /usr/include/sys/limits.h and /usr/include/nl_types.h, respectively:

    Symbolic Constant    Value
    NL_SETMAX            255
    NL_MSGMAX            32767
    NL_TEXTMAX           1023
    NL_SETD              1

    The format of a message text source file is defined as follows. Note that the fields of a message text source line are separated by a single ASCII space or tab character. Any other ASCII spaces or tabs are considered as being part of the subsequent field.

    $set n comment
        This line specifies the set identifier of the messages that follow until the next $set, $delset, or end-of-file appears. The n denotes the set identifier, which is defined as a number in the range [1, NL_SETMAX]. Set identifiers must be presented in ascending order within a single source file, but need not be contiguous. Any string following the set identifier is treated as a comment. If no $set directive is specified in a message text source file, all messages will be located in an implementation-defined default message set, NL_SETD.

    $delset n comment
        This line deletes message set n from an existing message cat
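The rules for the $set directive given above (a single separating space or tab, a set identifier in the range 1 to NL_SETMAX, anything after the identifier treated as a comment) can be sketched as a small parser. This is an illustration only: parse_set_directive is a hypothetical name, not part of gencat, and the limit is defined locally from the value quoted in this entry rather than taken from a system header.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_NL_SETMAX 255   /* value quoted in the entry above */

/* Parses a "$set n comment" line. Returns the set identifier, or -1 if
 * the line is not a valid $set directive. */
int parse_set_directive(const char *line)
{
    char *end;
    long id;

    /* The keyword must be followed by exactly one space or tab. */
    if (strncmp(line, "$set", 4) != 0 || (line[4] != ' ' && line[4] != '\t'))
        return -1;
    /* The identifier must start right after the separator. */
    if (!isdigit((unsigned char)line[5]))
        return -1;
    id = strtol(line + 5, &end, 10);
    /* Only end of line or a separator may follow; the rest is comment. */
    if (*end != '\0' && !isspace((unsigned char)*end))
        return -1;
    if (id < 1 || id > EXAMPLE_NL_SETMAX)
        return -1;
    return (int)id;
}
```

The sketch does not check that set identifiers appear in ascending order; a caller would compare each result with the previous one to enforce that rule.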
7. Keyboards to be used in France and Switzerland require special attention. On French keyboards, the key must be used to access the digits printed on the top row. A Swiss keyboard can be used in two modes. It has keys with four characters printed on them: the same two characters are printed twice, but in opposite order. In German Swiss mode, German characters like 6 are accessed by pressing a key, French ones like 4 by using the key as well. In French Swiss mode it works the opposite way.

4.4 Cyrillic or Greek Keyboards

Certain languages, such as Greek or Russian, use completely different alphabets, sometimes referred to as Cyrillic. Although they may look similar, the Russian and Greek alphabets do differ. What they have in common is the fact that they consist of a reasonably small set of letters (31 for Russian) and that, although some of the letters also exist in English, all of these letters are considered separate from the English set. A personal computer keyboard that supports these languages is designed differently from the ones discussed in the previous section.

The remainder of this section discusses a keyboard designed to support both U.S. English and Russian; use with Greek is theoretically the same. A U.S. English/Russian keyboard (other variants, such as German/Russian keyboards, exist) is physically identical to U.S. English keyboards. The only difference is that in addition to the
8. Refer to section 8, "TIPS FOR PROGRAMMERS," for more information.

International Supplement Manual for Advanced Users 15

4. SPECIFYING CHARACTER CLASSIFICATION INFORMATION

The character classification category determines the classification of characters as letters, digits, and so on, as well as some other information about the codeset and character set used. The default character classification recognises only the 26 ASCII letters as letters, which means that any program that processes non-English text and depends on the classification will behave incorrectly. For example, vi prints nonprintable characters using an octal notation; for vi to display non-ASCII characters correctly, you must change the character classification. Another example is programs that do uppercase to lowercase conversion: the standard table handles only ASCII.

4.1 Defining Character Classification

These definitions are created by placing a specification in the LC_CTYPE file in a locale directory. This specification is output by the chrtbl utility; refer to chrtbl(1M). The created table should also be copied to the /lib/chrclass directory.

4.2 When to Use the Character Classification locale Category

The created and installed definitions are not activated until the user specifies that they should be used. To do this, the user must set the LC_ALL, LC_CTYPE, or LANG environment variable to the directory in which the files are stored. Thi
9. • Format of the date display
• Format of the combined date and time display
• Format of the 12-hour time display
• Names of the days of the week
• Abbreviated names of the days of the week
• Names of the months
• Abbreviated names of the months
• Format of the ante meridiem and post meridiem strings used in 12-hour clock time displays

Note that the standard INTERACTIVE UNIX System library routine strftime (refer to ctime(3P)) is set up to use this information. The System V ctime routine, on the other hand, does not use the information created in this manner; it uses a different shell variable and searches in a different directory. Refer to section 9, "THE SYSTEM V ENVIRONMENT," in the International Supplement User's Manual for more information.

3.3 Creating a Date and Time Formatting Definition

The source language for the date and time category in the INTERACTIVE UNIX System is the language defined by the POSIX.2 group for the LC_TIME locale category.

A date and time editing source definition consists of a header, a date and time editing body, and a trailer. The header consists of the word LC_TIME. The trailer consists of the string END LC_TIME. The date and time editing body consists of one or more lines of text. Each line contains a keyword followed by one or more operands. Keywords are separated from the operands by one or more blank characters
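A program picks up the day names from an installed LC_TIME definition through strftime, as the note above says. The following sketch shows the two steps involved, assuming a present-day ANSI C library; the function names use_environment_time_locale and format_weekday are hypothetical.

```c
#include <locale.h>
#include <time.h>

/* Adopts the date and time category named by the environment
 * (LC_ALL, LC_TIME, or LANG). Returns 1 on success, 0 otherwise. */
int use_environment_time_locale(void)
{
    return setlocale(LC_TIME, "") != NULL;
}

/* Writes the full weekday name for `when` into buf, using the current
 * LC_TIME category. Returns the number of characters written, or 0 if
 * the buffer is too small. */
size_t format_weekday(const struct tm *when, char *buf, size_t size)
{
    return strftime(buf, size, "%A", when);
}
```

Without the call to setlocale, the program stays in the default "C" locale and format_weekday produces English day names regardless of the environment.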
10. Character Codesets and Text Transfer

Question 18: Does the implementation use the ISO 8859-1:1987 as its internal codeset?

Answer: The implementation does not prescribe a specific internal codeset. Any single-byte codeset that is a true superset of ISO 646 IRV, including ISO 8859-1:1987, can be used as the internal codeset.

Rationale: The XPG defines the ISO 8859-1:1987 as the major Western European transmission codeset and also recommends its use as the corresponding internal codeset.

Reference: XPG3, Volume 3, Page 19, Character Codesets and Text Transfer.

Conformance Statement XCS QUE 3 2 Questionnaire

2.5.2 Regular Expression Interfaces

Question 19: What form of regular expression syntax is supported by the regexp interface?

Answer: Simple Internationalised (assuming this is in regard to the <regexp.h> interface).

Rationale: The regexp interface may support either the simple regular expression or the simple internationalised regular expression syntax, as defined in the XPG3, Volume 3, Supplementary Definitions.

Reference: XPG3, Volume 3, Pages 49-51, Regular Expressions.

Chapter 3: Commands and Utilities

Product Identification

Product Identification: INTERACTIVE UNIX System V/386 Release 3.2
Version/Release No.: 3.0

If you do not supply this component yourself, please identify below the supplier you
11. Decimal notation: A decimal constant must be specified as the escape character followed by a d, followed by one, two, or three decimal digits. For example: \d99, \d231, \d77, \d97, \d121.

5.4.1 charmap Files

The colldef processor, as well as the iconv utility, can use the information stored in a charmap file. Refer to iconv(1P) for more information. These files are used to document the supported codesets. Each character in the coded character set is described with a symbolic name and the character encoding. The following is an excerpt from the charmap file describing IBM codepage 437. Refer to charmap(5P) for more information.

<C-cedilla>       \d128    LATIN CAPITAL LETTER C WITH CEDILLA
<u-diaeresis>     \d129    LATIN SMALL LETTER U WITH DIAERESIS
<e-acute>         \d130    LATIN SMALL LETTER E WITH ACUTE
<a-circumflex>    \d131    LATIN SMALL LETTER A WITH CIRCUMFLEX
<a-diaeresis>     \d132    LATIN SMALL LETTER A WITH DIAERESIS
<a-grave>         \d133    LATIN SMALL LETTER A WITH GRAVE
<a-ring>          \d134    LATIN SMALL LETTER A WITH RING ABOVE
<c-cedilla>       \d135    LATIN SMALL LETTER C WITH CEDILLA
<e-circumflex>    \d136    LATIN SMALL LETTER E WITH CIRCUMFLEX
<e-diaeresis>     \d137    LATIN SMALL LETTER E WITH DIAERESIS
<e-grave>         \d138    LATIN SMALL LETTER E WITH GRAVE
<i-diaeresis>     \d139    LATIN SMALL LETTER I WITH DIAERESIS
<i-circumflex>    \d140    LATIN SMALL LETTER I WITH
12. Due to limitations in the MIT code of X11 Release 4, key sequences and deadkeys cannot be supported when X-based applications are run. The one exception to this, however, is when text-based applications are used in an xpcterm window. These applications have access to the tty system, so ttymap can then be used to define deadkeys or compose sequences.

5. STORING DATA IN THE COMPUTER

The previous section explained how keyboards are used to generate letters and other characters on a computer running the INTERACTIVE UNIX Operating System. Typically, these characters are processed by the application that is currently running; it could be the shell (which is the command interpreter), an editor, or any other application. In most cases, the characters are echoed on the screen. Applications such as editors (vi or e, the TEN/PLUS editor, for example) store these characters in a file.

As mentioned earlier, a computer speaks no particular language and has no notion of what a letter is. It stores numbers in the file rather than letters. Unless every computer system uses the same number to store a certain letter, files created on one computer cannot be read on another. Most computer manufacturers use the same convention to represent characters internally; however, some differences in standards do exist. For example, many IBM computers (not PCs) use a standard called EBCDIC. The UNIX Operating System was de
13. English letters, the Russian letters are also pictured on the keycaps, usually in a different color (see Figure 2). Using ttymap, the keyboard is mapped to generate Russian characters when a key is pressed. A special key, called a toggle key, can be used within an application to switch between Russian and English. The default sequence for toggling between languages is F2.

This feature of the INTERACTIVE UNIX tty system and the ttymap utility has been especially designed to support languages such as Greek and Russian. The same toggle key can be used with European keyboards, for example to temporarily cause deadkeys to no longer act like deadkeys. A French programmer might decide to use the toggle key when he switches between a C source code file and a French text file.

Figure 2. English/Russian Personal Computer Keyboard Layout

4.5 Keyboard Layouts on 7-bit Terminals

The keyboards described so far are keyboards that are attached to devices capable of supporting 256 different symbols. Certain terminals only support up to 128 different symbols. The national keyboards supplied with these terminals sacrifice some of the symbols, such as and V (although these are very useful in the context of the UNIX Operating System), and replace them with local language characters. The terminal itself usually has a key that allows the user to specify
14. SYNOPSIS
    #include <nl_types.h>
    char *catgets(catd, set_id, msg_id, s)
    nl_catd catd;
    int set_id, msg_id;
    char *s;

DESCRIPTION
    The catgets function attempts to read message msg_id in set set_id from the message catalogue identified by catd. The catd argument is a message catalogue descriptor returned from an earlier call to catopen(3P). The s argument points to a default message string that will be returned by catgets if it cannot retrieve the identified message.

RETURN VALUES
    If the identified message is retrieved successfully, catgets returns a pointer to an internal buffer area containing the null-terminated message string. If the call is unsuccessful for any reason, s is returned.

ERRORS
    No errors are defined.

SEE ALSO
    catopen(3P).

NOTE TO USERS
    This entry is reprinted from the INTERACTIVE SDS Guide and Programmer's Reference Manual.

catopen(3P)

NAME
    catopen - open a message catalogue

SYNOPSIS
    #include <nl_types.h>
    nl_catd catopen(name, oflag)
    char *name;
    int oflag;

DESCRIPTION
    The catopen function opens a message catalogue and returns a message catalogue descriptor. The name argument specifies the name of the message catalogue to be opened. If name contains a slash, then name specifies a complete name for the message catalogue. Otherwise, the environment variable NLSPATH is used with name substi
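The two entries above combine into a common pattern: open the catalogue, ask for a message with a default string, and fall back to the default when either step fails. The following is a minimal sketch in ANSI C for a present-day X/Open-style library. The helper name fetch_message is hypothetical; catclose is assumed to be available alongside catopen and catgets, and (nl_catd)-1 is assumed to be the failure value of catopen, as in current X/Open systems.

```c
#include <nl_types.h>
#include <stdio.h>

/* Copies message msg_id of set set_id from the named catalogue into buf,
 * or copies `fallback` when the catalogue or the message is missing.
 * Returns 1 if the message came from the catalogue, 0 otherwise. */
int fetch_message(const char *catalogue, int set_id, int msg_id,
                  const char *fallback, char *buf, size_t size)
{
    nl_catd catd = catopen(catalogue, 0);
    const char *message = fallback;
    int found = 0;

    if (catd != (nl_catd)-1) {
        message = catgets(catd, set_id, msg_id, fallback);
        found = (message != fallback);
    }
    /* Copy before closing: catgets may return a pointer to an internal
     * buffer that is no longer valid once the catalogue is closed. */
    if (size > 0)
        snprintf(buf, size, "%s", message);
    if (catd != (nl_catd)-1)
        catclose(catd);
    return found;
}
```

Keeping the default string in the source means the program still produces readable output when no catalogue is installed for the user's language.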
15. and may contain spaces.

The word SIZE, followed by the point size of the characters, the x resolution, and the y resolution of the font. The sizes are not verified by loadfont, but the line containing this keyword needs to be there for compatibility purposes.

The word FONTBOUNDINGBOX, followed by the width in x, height in y, and the x and y displacement of the lower left-hand corner from the origin. Again, the sizes are not verified by loadfont, but the line containing this keyword needs to be there for compatibility purposes.

Optionally, the word STARTPROPERTIES, followed by the number of properties that follow. If present, the number needs to match the number of lines following this one before the occurrence of a line beginning with ENDPROPERTIES. These lines consist of a word for the property name, followed by either an integer or a string surrounded by double quotes. Properties named FONT_ASCENT, FONT_DESCENT, and DEFAULT_CHAR are typically present in BDF files to define the logical font ascent and font descent and the default char for the font. As mentioned above, this section, if it exists, is terminated by ENDPROPERTIES.

The word CHARS, followed by the number of characters that follow. This number should always be 256. This terminates the part of the loadfont input file describing features of the font in general. The rest of the file contains descr
16. as the radix character in scripts

Command | Behaviour Specified in XPG3 | Supported
comm | LC_COLLATE affects sorting sequence | Yes
cp, ln, mv | LANG affects yes string | Yes
cpio | LC_COLLATE, LC_CTYPE affect filename pattern matching | Yes
cpio | LC_TIME affects date format | Yes
date | LC_TIME affects date formatting options | Yes
ed, red | LC_COLLATE, LC_CTYPE affect regular expression matching | Yes
ed, red | LC_CTYPE is used to determine whether characters are printable | Yes
egrep | LC_COLLATE, LC_CTYPE affect regular expression matching | Yes
egrep | LC_CTYPE is used to determine character classification (alphabetic, upper-case, lower-case) | Yes
expr | LC_COLLATE, LC_CTYPE affect regular expression matching | Yes
expr | LC_COLLATE affects the behaviour of relational operators | Yes
fgrep | LC_CTYPE is used to determine character classification (alphabetic, upper-case, lower-case) | Yes
find | LANG affects yes string | Yes
find | LC_COLLATE, LC_CTYPE affect filename pattern matching | Yes
grep | LC_COLLATE, LC_CTYPE affect regular expression matching | Yes
grep | LC_CTYPE is used to determine character classification (alphabetic, upper-case, lower-case) | Yes
join | LC_COLLATE affects sorting sequence | Yes
lpstat | LC_TIME affects date format | Yes
ls | LC_COLLATE affects sorting sequence | Yes
ls | LC_CTYPE is used to determine whether a charact | Yes
17. codeset is used, different numbers represent the characters. To keep track of this, the system uses a classification table, which contains information about all 256 characters in the codeset. Things that can be specified are:

• Lowercase letters
• Uppercase letters
• Digits
• White space characters
• Punctuation characters
• Control characters
• Uppercase to lowercase conversion
• Lowercase to uppercase conversion
• Printable characters or nonprintable characters

Programs that are written to use functions like isupper and isdigit (refer to ctype(3C)) access this table and behave accordingly. The default table used by the system is the ASCII table, which considers every 8-bit character nonprintable. This explains why programs such as vi do not display 8-bit characters correctly, but show their octal representations instead, unless the proper environment is set up.

Using more than just the ASCII characters changes the meaning of many things, including the meaning of regular expressions. The string [a-z] no longer represents all lowercase characters. In some languages there are alphabetic characters after z in the dictionary and, as discussed earlier, most codesets contain lowercase characters that are stored as 8-bit characters, which would be ignored if the above expression were evaluated numerically.

The X/Open Portability Guide specifies internationalised regular expressions. It introduces keywords that can be used to specify clas
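The difference between consulting the classification table and comparing codes numerically, as a naive reading of [a-z] does, can be sketched as follows. Both function names are hypothetical; the sketch assumes an ANSI C library and a single-byte codeset.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts lowercase letters by asking the classification table of the
 * current locale, so 8-bit lowercase letters count once a suitable
 * LC_CTYPE category is active. */
size_t count_lower_by_class(const char *text)
{
    size_t count = 0;

    for (; *text != '\0'; text++)
        if (islower((unsigned char)*text))
            count++;
    return count;
}

/* Counts lowercase letters by numeric range only. Any letter stored
 * outside the codes for a through z is ignored, whatever the locale. */
size_t count_lower_by_range(const char *text)
{
    size_t count = 0;

    for (; *text != '\0'; text++)
        if (*text >= 'a' && *text <= 'z')
            count++;
    return count;
}
```

In the default "C" locale both functions agree. They diverge only after the program calls setlocale with a category whose table classifies some 8-bit characters as lowercase.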
18. enclosed between angle brackets. The encoding part must be expressed as a decimal, octal, or hexadecimal constant in the following formats (the \ represents the escape character):

    \dnnn    decimal value
    \xnn     hexadecimal value
    \nnn     octal value

Decimal constants are represented by two or three decimal digits, preceded by the escape character and the lowercase letter d; for example, \d97 or \d143. Hexadecimal constants are represented by two hexadecimal digits, preceded by the escape character and the lowercase letter x; for example, \x61 or \x8f. Octal constants are represented by two or three octal digits, preceded by an escape character; for example, \141 or \217.

Example of part of a charmap file:

    CHARMAP
    <NUL>             \000
    <newline>         \12
    <percent-sign>    \x25
    <one>             \d049
    <A>               \d065
    <A-acute>         \d193
    END CHARMAP

EXTENDED_CHARMAP

NOTES
    The INTERACTIVE UNIX System does not support multi-byte coded character sets. However, certain common codesets, such as ISO 6937, define certain accented letters as combinations of two bytes (dead-key sequences). As an example, the letter <A-acute> may be represented by a two-byte sequence, the first byte representing the accent and the second the base letter. The iconv utility requires that such characters be defined in the charmap. They must be defined
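The three constant formats described above can be decoded with a short routine. This sketch is not taken from colldef or iconv: parse_charmap_encoding is a hypothetical name, the escape character is passed in by the caller, and the digit-count limits from the text are relaxed to "at least one digit, value at most 255".

```c
/* Returns the numeric value of one digit, or -1 if c is not a digit
 * in any base up to 16. */
static int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes a charmap encoding: escape + 'd' + decimal digits,
 * escape + 'x' + hexadecimal digits, or escape + octal digits.
 * Returns the value (0 to 255), or -1 if the text is malformed. */
int parse_charmap_encoding(const char *text, char escape)
{
    int base = 8;
    int value = 0;

    if (text[0] != escape)
        return -1;
    text++;
    if (*text == 'd') {
        base = 10;
        text++;
    } else if (*text == 'x') {
        base = 16;
        text++;
    }
    if (*text == '\0')
        return -1;              /* no digits at all */
    for (; *text != '\0'; text++) {
        int digit = digit_value(*text);

        if (digit < 0 || digit >= base)
            return -1;
        value = value * base + digit;
        if (value > 255)
            return -1;          /* does not fit in a single byte */
    }
    return value;
}
```

With a backslash as the escape character, the three notations for the letter a (decimal 97, hexadecimal 61, octal 141) all decode to the same value.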
19. pared based on the primary weight. If they are equal, and more than one weight has been assigned, then the strings are compared again and again until the strings either compare unequally or the weights are exhausted. Comparisons may proceed either from the beginning of the strings toward the end, or from the end toward the beginning.

5.4 Creating a Collation Sequence Definition

The source language for collation definitions in the INTERACTIVE UNIX System is the language specified by the POSIX.2 group for the LC_COLLATE locale category.

A collation sequence definition describes the relative order among collating elements (characters and multicharacter collating elements) in the locale. This order is expressed in terms of collation values, or weights, by assigning each element one or more collation values. The collation sequence definition is used by regular expressions, pattern matching, and sorting.

A collation source definition consists of a collation header, a collation body, and a collation trailer. The collation header is the word LC_COLLATE. The collation trailer is the string END LC_COLLATE. The collation body consists of one or more lines of text, each of which contains an identifier, optionally followed by one or more operands. Identifiers are either keywords or collating elements. Identifiers are separated from the operands by one or more blank characters (space or tab). Operands are characters, collating elements, or stri
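From a program's point of view, an installed collation sequence is reached through the standard C comparison function strcoll, which compares two strings according to the current LC_COLLATE category. The following sketch sorts an array of words that way; sort_collated and compare_collated are hypothetical names, and an ANSI C library is assumed.

```c
#include <stdlib.h>
#include <string.h>

/* qsort callback: orders two strings by the collation sequence of the
 * current locale rather than by the numeric values of their bytes. */
static int compare_collated(const void *left, const void *right)
{
    const char *const *a = left;
    const char *const *b = right;

    return strcoll(*a, *b);
}

/* Sorts `count` strings in place according to LC_COLLATE. */
void sort_collated(const char **words, size_t count)
{
    qsort(words, count, sizeof words[0], compare_collated);
}
```

Until the program selects a locale with setlocale, strcoll behaves like strcmp, so the result is plain byte order.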
20. sign for a non negative formatted mone tary quantity Set to a value indicating the positioning of the negative sign for a negative formatted monetary quantity END LC MONETARY Example LC MONETARY int curr symbol currency symbol mon decimal point mon thousands sep mon grouping negative sign int frac digits frac digits p cs precedes p sep by space n cs precedes n sep by space n sign posn END LC MONETARY LC_NUMERIC This keyword must be the last in the file USD The information in the LC NUMERIC file is in text format Each line in the text file contains a keyword and a value separated by space s or tab s Lines starting with a are ignored The following keywords are recognized LC NUMERIC INTERACTIVE UNIX System This keyword must be the first in the file 4 International Supplement 1 1 5 decimal point thousands sep grouping locale 5P The value is the character to be used as decimal delimiter it may be enclosed in quotation marks The value is the character used as the thousands separator it may be enclosed in quotation marks The value is a string of semicolon separated numbers as described in ocaleconv 3P END LC NUMERIC This keyword must be the last in the file Example LC NUMERIC decimal point thousands sep grouping END LC NUMERIC LC_TIME The information in 33330 the LC_TIME file is in text format Each line in the tex
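A program can read the values of the numeric category through the standard localeconv interface, which the grouping keyword above also refers to. The following is a minimal sketch assuming an ANSI C library; the two accessor names are hypothetical.

```c
#include <locale.h>

/* Returns the decimal delimiter of the current LC_NUMERIC category. */
const char *current_decimal_point(void)
{
    const struct lconv *conventions = localeconv();

    return conventions->decimal_point;
}

/* Returns the thousands separator of the current LC_NUMERIC category;
 * an empty string means that no separator is defined. */
const char *current_thousands_sep(void)
{
    const struct lconv *conventions = localeconv();

    return conventions->thousands_sep;
}
```

The returned strings belong to the library and may change after a later call to setlocale, so a caller that needs them for long should copy them.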
21. strtol     EINVAL     Yes
           ERANGE     Yes
strxfrm    EINVAL     Yes
unlink     ETXTBSY    Yes

Rationale: Each of the above error conditions is marked as optional in the XPG, and an implementation may return this error in the circumstances specified or may not provide the error indication. Those items marked with a † are also considered to be optional error conditions in POSIX.1. The EINVAL error condition for the three functions sigaddset, sigdelset, and sigismember is mandated in the XPG but is considered optional in POSIX.1. An X/Open conforming implementation will always produce these errors, but a POSIX.1 conforming implementation may not.

Reference: XPG3, Volume 2, Page 32, Error Numbers.

2.1.5 Mathematical Interfaces

Question 6: What format of floating point numbers is supported by this implementation?

Answer: IEEE floating point format.

Options:
1. IEEE floating point format
2. Description of floating point format supported

Rationale: Most implementations support IEEE floating point format, either in hardware or software. Some implementations support other formats with different exponent and mantissa accuracy. These differences need to be defined.

Question 7: Is long double form supported, and what precision is associated with this form?

Answer: Not supported. Long double equates to double.

Options:
1. Not supported. Long double equates to double.
2. De
22. the XPG points out the differences related to the truncation of negative numbers.

Reference: XPG, Volume 4, Page 10, Conversions.

Question 3: What truncation rules are applied when using the division operator and either of the operands is negative?

Answer: Truncation toward zero.

Rationale: The XPG states that such truncations are machine dependent.

Reference: XPG, Volume 4, Page 16, Expressions.

Chapter 15: Source Code Transfer
Section 15.1: Utilities

Product Identification

Product Identification: INTERACTIVE UNIX System V/386 Release 3.2
Version/Release No.: 3.0

If you do not supply this component yourself, please identify below the supplier you reference.

15.1.1 Conformance Reference

Indicator of Compliance: None

Environment Specification

Enter below details of the hardware and software environment in which conformance is claimed, including compilation routines and installation procedures, if any. Sufficient detail must be supplied to enable conformant behaviour to be reproduced.

Any 386/486 compatible system with at least 4 MB of RAM and the following INTERACTIVE UNIX System V/386 Release 3.2 Version 3.0 subsets and extensions installed (approximately 40 MB of disk space is needed):

Core, Kernel Configuration, File Management, International Supplement, RES
23. 14 and 16. All BBX lines of the subsequent characters should list the same height and width as the first one, because only fixed-size fonts are supported.

The optional word ATTRIBUTES, followed by the attributes as 4 hex-encoded characters. The loadfont utility will accept this line, if present, but there is no meaning attached to it.

The word BITMAP, which indicates the beginning of the bitmap representation of the character. This line should be followed by height lines (height as specified in the BBX line) representing a hex-encoded bitmap of the character, one byte per line.

The word ENDCHAR, indicating the end of the bitmap for this character.

After all the bitmaps, the end of the file is indicated by the ENDFONT keyword.

Example
    The following example lists the beginning of the loadfont input file for an 8 by 16 font supporting the IBM 437 codeset, as well as the bitmap representation of the character uppercase A.

    STARTFONT 2.1
    FONT 8x16
    SIZE 16 75 75
    FONTBOUNDINGBOX 8 16 0 -4
    STARTPROPERTIES 3
    FONT_DESCENT 4
    FONT_ASCENT 12
    DEFAULT_CHAR 0
    ENDPROPERTIES
    CHARS 256
    STARTCHAR C0000
    ENCODING 0

    Bitmap for uppercase A character:

    STARTCHAR C0041
    ENCODING 65
    SWIDTH 666 0
    DWIDTH 8 0
    BBX 8 16 0 -4
    BITMAP
    00
    00
    10
    38
    6c
    c6
    c6
    fe
    c6
    c6
    c6
    c6
    00
    00
    00
    00
    ENDCHAR
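Each line between BITMAP and ENDCHAR in the example above is one byte written as two hexadecimal digits, and each bit of that byte is one pixel of an 8-pixel row, most significant bit leftmost. The following sketch turns such a line into a visible row. It is an illustration only, unrelated to the loadfont source; render_bitmap_row is a hypothetical name.

```c
#include <ctype.h>
#include <stdlib.h>

/* Renders one hex-encoded 8-pixel row, such as "6c", into `out` as eight
 * characters ('#' for a set pixel, '.' for a clear one) plus a
 * terminating null. Returns 0 on success, -1 if `hex` is not exactly two
 * hexadecimal digits. */
int render_bitmap_row(const char *hex, char out[9])
{
    unsigned long bits;
    int column;

    if (!isxdigit((unsigned char)hex[0]) ||
        !isxdigit((unsigned char)hex[1]) || hex[2] != '\0')
        return -1;
    bits = strtoul(hex, NULL, 16);
    for (column = 0; column < 8; column++)
        out[column] = (bits & (0x80u >> column)) ? '#' : '.';
    out[8] = '\0';
    return 0;
}
```

Applied to the sixteen rows of the example, the rows 10, 38, 6c, c6, and fe draw the peak, the sides, and the crossbar of the uppercase A.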
24. Contents (continued)
6.7.1 int_curr_symbol Keyword
6.7.2 currency_symbol Keyword
6.7.3 mon_decimal_point Keyword
6.7.4 mon_thousands_sep Keyword
6.7.5 mon_grouping Keyword
6.7.6 positive_sign and negative_sign Keywords
6.7.7 int_frac_digits Keyword
6.7.8 frac_digits Keyword
6.7.9 p_cs_precedes and n_cs_precedes Keywords
6.7.10 p_sep_by_space and n_sep_by_space Keywords
6.7.11 p_sign_posn and n_sign_posn Keywords
6.7.12 An Example of a Monetary Category Definition
6.7.13 How a Program Uses This Information
7. SPECIFYING YES/NO RESPONSE INFORMATION
7.1 Reasons for Defining Yes/No Responses
7.2 Defining Yes/No Responses
7.3 When to Use the Yes/No Response locale Category
7.4 Creating a Yes/No Response Category Definition
7.4.1 yesexpr Keyword
7.4.2 noexpr Keyword
7.4.3 An Example of a Response Category Definition
7.4.4 How a Program Uses This Information
8. TIPS FOR PROGRAMMERS
8.1 Character Mapping
8.2 Giving Programs Access to locales
8.3 Date and Time
8.4 Character Classification
8.5 Collation
8.6 Regular Expressions
8.7 Numeric and Monetary Formatting
8.8 Message Catalogues
8.8.1 Extension of printf Syntax

International Supplement Manual for Advanced Users
1. INTRODUCTION
This document explains how to
25. 7.4.2 noexpr Keyword
This keyword specifies the character or string to use as the negative (no) response. The format is:
    noexpr regular_expression
where regular_expression is a regular expression that reports a match when it is applied to a negative response.

7.4.3 An Example of a Response Category Definition
    LC_MESSAGES Yy noexpr Nn on END LC_MESSAGES

7.4.4 How a Program Uses This Information
If a program needs to access the values in the current locale, it can do so via the nl_langinfo library interface. Refer to nl_langinfo(3P) for more information.

8. TIPS FOR PROGRAMMERS
This section is written for programmers who want to take advantage of the INTERACTIVE Software Development System capabilities that support internationalisation, in particular those described in the X/Open Portability Guide. It is not designed as a programmer's guide; it simply points programmers to the appropriate references where these features are described. Manual entries that deal with the features appear in the International Supplement Reference Manual and in the INTERACTIVE SDS Guide and Programmer's Reference Manual.
To be able to use all the features described, programs should always be compiled and linked using the -Xp option, and contain the following line in the source file before the inclusion of any header files:
    #define
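As a rough illustration of section 7.4.4, the sketch below retrieves the affirmative-response expression through nl_langinfo and applies it with the POSIX regular expression functions. It assumes that the langinfo item is named YESEXPR (the name used by later X/Open issues) and that regcomp and regexec are available; the helper name is hypothetical.

```c
#include <langinfo.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `answer` matches the affirmative
 * expression of the current locale, 0 if it does not, and -1 if the
 * expression cannot be compiled. */
int is_affirmative(const char *answer)
{
    regex_t re;
    int matched;

    /* YESEXPR is an assumed item name; check langinfo.h on the target system. */
    if (regcomp(&re, nl_langinfo(YESEXPR), REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    matched = (regexec(&re, answer, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

A program would normally call setlocale(LC_ALL, "") once at startup so that the expression comes from the user's locale rather than from the default one.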
26. English language. For most languages in the world, however, this is not enough. Most European languages contain more letters than the 26 in the English language, with the additional letters typically collating between the letters in the ASCII set. For example, an accented a sorts between a and b.
The European user expects sorted lists (for instance, the output from the ls command) to appear in the collation order of his or her language.
The INTERACTIVE UNIX Operating System provides users with the ability to define their own collation order. This capability is a superset of the X/Open requirement for an internationalised system, and it is expected to satisfy the requirements for dictionary ordering for most European languages and non-European alphabetic languages.

5.3 Capabilities
The following capabilities are provided:
1. Multicharacter collating elements
The term collating element is used to describe the basic entities that are compared in collation. All characters in the character set are automatically collating elements. In addition, the user can define multicharacter collating elements: sequences of two or more characters to be collated as a single entity. For example, the Spanish ch collates as an entity between c and d.
2. User-defined ordering of collating elements
The user has complete control over the order in which characters and multicharacter collating elements
27. For example, the default decimal delimiter is a period, but in most European countries the comma is used instead. By defining numeric and monetary formatting with the correct values, programs display fractions using the appropriate decimal delimiter.

6.2 Defining Numeric and Monetary Formatting
These definitions are created by placing a specification in the appropriate file, either LC_NUMERIC or LC_MONETARY, in a locale directory.

6.3 When to Use the Numeric and Monetary locale Category
The created and installed definitions are not activated until the user specifies that they should be used. The user must set the LC_NUMERIC environment variable to the directory in which the LC_NUMERIC file is stored, and the LC_MONETARY environment variable to the directory in which the LC_MONETARY file is stored. Alternatively, the user can set the LC_ALL or LANG environment variable to the directory to specify both. This must be done before a program using the stored definitions is executed. Note that the program must be set up to check and set the international environment via the setlocale function. In the INTERACTIVE UNIX System, the standard utilities that depend on numeric editing, such as awk, have been modified to use the international environment.

6.4 Numeric Editing
Numeric editing controls the appearance of nonmonetary numbers as well as the input format. The following three aspects of numeric editing are controlled via the LC_NUMERIC locale category:
28. Greek use a completely different set of characters. For these languages, collation takes on additional complexities.
The INTERACTIVE UNIX Operating System allows users to define their own collation order. This capability is a superset of the X/Open requirement for an internationalised system and is expected to satisfy the requirements for dictionary ordering for most European languages and non-European alphabetic languages. The standard utilities that depend on collation, such as sort and ls, have been modified to understand this user-specified collation order and are supplied with the International Supplement.

8.3.1 An Example
Consider the following four lines (the four seasons in French):
    printemps
    été
    automne
    hiver
The regular UNIX System sort utility sorts them as follows:
    automne
    hiver
    printemps
    été
It uses the numeric representation of characters, and because é is represented by an 8-bit character, it is listed last. The UNIX System sort used to strip the eighth bit, sorting the above sequence as:
    été
    automne
    hiver
    printemps
which is, of course, wrong as well. Making utilities 8-bit clean is not always sufficient. The internationalised sort gives the following correct result:
    automne
    été
    hiver
    printemps

8.4 Numeric and Monetary Formatting
The default conventions for decimal delimiter and other numeric formatting rules are seldom correct in an
29. INTERACTIVE Software Development System.
1600 bpi PE magnetic tape is supported with the INTERACTIVE UNIX Operating System when using a controller card and a tape unit for which a device driver is available. Several vendors provide such hardware/software.

Temporary Waivers
List below references to any temporary waivers granted by X/Open in respect of minor errors in the product referenced above. This should include the X/Open reference and the waiver expiry date. The waivers as granted shall be made available with this document on request.

Formats
Question 1: Which exchange media format(s) may be written by the system?
Answer:
    80-track diskettes          Yes
    40-track diskettes          Yes
    1600 bpi PE magnetic tape   Yes
Rationale: XPG3 states that standards are referenced for transfer of diskettes and magnetic tapes between machines. Because of the different nature of X/Open-conformant systems, it is not possible to define a single portable medium that is supported across the whole range of systems.
Reference: XPG3 Volume 3, Chapters 15, 16 and 17.

Question 2: Which exchange media format(s) may be read by the system?
Answer:
    80-track floppy disk        Yes
    40-track floppy disk        Yes
    1600 bpi PE magnetic tape   Yes
Rationale: XPG3 states that standards are referenced for transfer of diskettes and magnetic tapes between machines. Because of the different nature of X
30.
    Constant   Description
    D_T_FMT    String for formatting date and time
    D_FMT      String for formatting of date
    T_FMT      String for formatting of time
    AM_STR     Ante Meridiem abbreviation
    PM_STR     Post Meridiem abbreviation
    DAY_1      Name of the first day of the week (e.g. Sunday)
    DAY_2      Name of the second day of the week (e.g. Monday)
    DAY_3      Name of the third day of the week (e.g. Tuesday)
    DAY_4      Name of the fourth day of the week (e.g. Wednesday)
    DAY_5      Name of the fifth day of the week (e.g. Thursday)
    DAY_6      Name of the sixth day of the week (e.g. Friday)
    DAY_7      Name of the seventh day of the week (e.g. Saturday)
    ABDAY_1    Abbreviated name of the first day of the week
    ABDAY_2    Abbreviated name of the second day of the week
    ABDAY_3    Abbreviated name of the third day of the week
    ABDAY_4    Abbreviated name of the fourth day of the week
    ABDAY_5    Abbreviated name of the fifth day of the week
    ABDAY_6    Abbreviated name of the sixth day of the week
    ABDAY_7    Abbreviated name of the seventh day of the week
    MON_1      Name of the first month of the year (e.g. January)
    MON_2      Name of the second month of the year (e.g. February)
    MON_3      Name of the third month of the year (e.g. March)
    MON_4      Name of the fourth month of the year (e.g. April)
    MON_5      Name of the fifth month of the year (e.g. May)
    MON_6      Name of the sixth month of the year (e.g. June)
    MON_7      Name of the seventh month of the year (e.g. July)
    MON_8      Name of the eighth month of the year (e.g. August)
    MON_9      Name of the ninth month of the year (e.g. September)
    MON_10     Name of the tenth mo
31. Numeric and Monetary Formatting
printf and other functions have been modified to use numeric formatting. It is accessed using the statement
    setlocale(LC_NUMERIC, "")
in the program. Although no functions currently use monetary formatting, applications can do so by using the statement
    setlocale(LC_MONETARY, "")
in the program. Note that using LC_ALL is sufficient to do the job for all locale categories.
When the value of one of the numeric or monetary conventions is needed in the flow of the program, the localeconv function can be used. It returns a data structure containing all the relevant values. Refer to localeconv(3P) for more information.

8.8 Message Catalogues
Three functions should be used to write programs that use message catalogues rather than hardcoded text:
- catopen
This function takes two arguments, the second of which should always be zero. The first argument, name, of type char *, specifies the name of the message catalogue to be opened. If name contains a slash (/), it specifies a complete name for the message catalogue. Otherwise, the environment variable NLSPATH is used, with name substituted for %N (refer to environ(5P) for the description of NLSPATH). If NLSPATH does not exist in the environment, or if a message catalogue cannot be opened in any of the components specified by NLSPATH, then the default used by this implementation is /lib/locale/IS
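The use of localeconv described above can be sketched as follows. This is only an illustration under the assumption of a standard ANSI C library (snprintf in particular is a later addition and may need to be replaced by sprintf on older systems); the helper names are hypothetical.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: return the decimal delimiter of the current locale.
 * localeconv() returns a pointer to static storage, so nothing is freed. */
const char *current_decimal_point(void)
{
    struct lconv *conventions = localeconv();
    return conventions->decimal_point;
}

/* Hypothetical helper: format `value` with two fraction digits.
 * The printf family honours LC_NUMERIC once setlocale() has been called. */
int format_number(char *buffer, size_t size, double value)
{
    return snprintf(buffer, size, "%.2f", value);
}
```

A program would call setlocale(LC_ALL, "") at startup; in a locale whose LC_NUMERIC definition sets the decimal point to a comma, format_number would then produce a comma in place of the period.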
32. X/Open Conformance Statement (XCS QUE 3.2) Questionnaire

Contents
Chapter 2: Internationalised System Calls and Libraries
    Section 2.1: General Attributes
    Section 2.2: Process Handling
    Section 2.3: File Handling
    Section 2.4: General Terminal Interface
    Section 2.5: Internationalised System Interfaces
Chapter 3: Commands and Utilities
    Section 3.1: Basic Utilities
    Section 3.2: Development Utilities
    Section 3.3: Internationalisation Option
Chapter 4: C Language
Chapter 15: Source Code Transfer

Chapter 2: Internationalised System Calls and Libraries

Product Identification
Product Identification: INTERACTIVE UNIX System V/386 Release 3.2
Version/Release No.: 3.0
If you do not supply this component yourself, please identify below the supplier you reference.

Conformance Reference
Indicator of Compliance: VSX Test Suite Release 3.204
Testing Agency Name: UniSoft Corporation
Address: 6121 Hollis Street, Emeryville, CA 94608-2092

Environment Specification
Enter below details of the hardware and software environment in which testing took place, including compilation routines and installation procedures, if any. Sufficient detail must be supplied to enable conformant behaviour and any test results to be reproduced.
Any 386/486-compatible system with at least 4 MB of RAM and with the following INTERACTIVE UNIX System V/386
33. Open-conformant systems, it is not possible to define a single portable medium which is supported across the whole range of systems. In addition, some systems can read a wider range of formats than they can write.
Reference: XPG3 Volume 3, Chapters 15, 16 and 17.

Utilities
Question 3: Which utilities are used to create and read the archive formats specified in XPG Volume 3, XSI Supplementary Definitions?
Answer:
    Format         Creating   Reading
    Extended tar   tar        tar
    cpio           cpio       cpio
Options: A definition of the commands used to create and read these formats. If a special option is required to produce the specified format, this must be detailed.
Refer to POSIX.1 Conformance Document, Section 10.1.
Rationale: There is no explicit definition as to the commands that must be used to create and retrieve these archives. On most systems this will be achieved by the tar and cpio commands. There are other commands available that produce these archives. On some implementations the command may need a special option to enable reading of the specified formats, with the standard option being to create archives which are backwards compatible with previous versions of the command.
Reference: XPG3 Volume 3, Page 151 2, Utilities.

Invalid File Names
Question 4: What file name is used to cont
34. internationalised regular expressions for all of the above utilities. It should be noted that the sdb command is an optional development utility and may not be available on all XPG-conforming systems.
Reference: XPG3 Volume 3, Pages 49-51, Regular Expressions.

Chapter 4: C Language

Product Identification
Product Identification: INTERACTIVE UNIX System V/386 Release 3.2
Version/Release No.: 3.0
If you do not supply this component yourself, please identify below the supplier you reference.

Conformance Reference
Indicator of Compliance: VSX Test Suite Release 3.204
Testing Agency Name: UniSoft Corporation
Address: 6121 Hollis Street, Emeryville, CA 94608-2092

Environment Specification
Enter below details of the hardware and software environment in which testing took place, including compilation routines and installation procedures, if any. Sufficient detail must be supplied to enable conformant behaviour and any test results to be reproduced.
Any 386/486-compatible system with at least 4 MB of RAM and the following INTERACTIVE UNIX System V/386 Release 3.2 Version 3.0 subsets and extensions installed (approximately 40 MB of disk space is needed): Core, Kernel Configuration, File Management, International Supplement, INTERACTIVE Software Development
35. Rationale: For an X/Open-conforming implementation, the _POSIX_SAVED_IDS option must be provided. The other options may or may not be provided. The provision of the file-system-related options can vary within a system. For example, a system which has traditionally supported both System V and BSD type file systems may provide a mechanism whereby the option is enforced for certain files or processes but not for others. This technique can be used to achieve a degree of backwards compatibility that would not otherwise be possible.
Reference: XPG3 Volume 2, Page 579, <unistd.h>.

2.1.2 C Standard
Question 2: Does the implementation only support Common Usage C, or also support ANSI C Standard interface definitions?
Answer(s): Only Common Usage C 1 Only Common Usage C 2 Both Common Usage C and ANSI C
Rationale: The POSIX.1 standard allows for a conforming system to support either Common Usage C or ANSI C Standard interface definitions. The XPG is based on a Common Usage C definition but does not prohibit an ANSI C implementation. A Common Usage C definition must provide function declarations for the C language functions in the XPG, as well as providing function semantics that conform to the XPG. An ANSI C Standard interface must provide function prototypes and ANSI C semantics, as well as providing XPG semantics. There are no known areas of contradiction between the ANSI C and XPG semantics.
36. See Note 4. See Note 5. See Note 6.

1. The character sequences ch and ss are defined as collating elements.
2. The collating symbols <UPPER_CASE>, <LOWER_CASE>, <NO_ACCENT>, <GRAVE> and <ACUTE> are placed first in the ordering sequence, followed by the space symbol.
3. Characters with code values between space and A are placed in the basic ordering sequence after the space, but are ignored for collation purposes.
4. The accented and unaccented A's have the same primary weight; that is, they belong to an equivalence class. The secondary weight is based on case but ignores accents. The third weight considers accents. This definition uses the collating symbols and their relative order: uppercase before lowercase, no accents before accents. The definition can be viewed as a directive to transform strings by weight before comparing them. For example, when comparing the strings abba and Abba, the two strings are first compared using the primary weight. This equates to comparing ABBA with ABBA; that is, they compare as equals. On secondary weighting, they compare as follows:
    <LOWER_CASE> <LOWER_CASE> <LOWER_CASE> <LOWER_CASE>
against
    <UPPER_CASE> <LOWER_CASE> <LOWER_CASE> <LOWER_CASE>
The first collates after the second.
The accented and unaccented C's also belong to an equivale
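The two-pass comparison described in Note 4 can be sketched in C as below. This is a simplified illustration only: it handles unaccented ASCII letters, it omits the third (accent) weight, and the function name is hypothetical. It is not how colldef or strcoll are implemented.

```c
#include <ctype.h>
#include <stddef.h>

/* Two-level comparison sketch for unaccented ASCII strings.
 * Pass 1 (primary weight): compare the strings ignoring case.
 * Pass 2 (secondary weight): uppercase collates before lowercase. */
int two_level_compare(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    size_t i;

    for (i = 0; p[i] != '\0' && q[i] != '\0'; i++) {
        int difference = toupper(p[i]) - toupper(q[i]);
        if (difference != 0)
            return difference;
    }
    if (p[i] != q[i])                  /* one string is a prefix of the other */
        return (int)p[i] - (int)q[i];

    /* The strings are equal on primary weight and have the same length. */
    for (i = 0; p[i] != '\0'; i++) {
        int a_is_lower = islower(p[i]) != 0;
        int b_is_lower = islower(q[i]) != 0;
        if (a_is_lower != b_is_lower)
            return a_is_lower - b_is_lower;   /* lowercase sorts later */
    }
    return 0;
}
```

With this sketch, two_level_compare("abba", "Abba") is positive, which matches the statement that the first string collates after the second.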
37. System.

Temporary Waivers
List below references to any temporary waivers granted by X/Open in respect of minor errors in the product referenced above. This should include the X/Open reference and the waiver expiry date. The waivers as granted shall be made available with this document on request.

Section 4.1: Implementation Limits
Question 1: What limits does the implementation impose on the significant part of an identifier?
Answer:
    External identifiers: an infinite number of characters
    Non-external identifiers: an infinite number of characters
Rationale: The XPG states that while there is no limit to the length of an identifier, only a certain number of characters are significant. The XPG points out that there must be at least eight characters for a non-external name, but there may be fewer for external names.
Reference: XPG3 Volume 4, Page 3, Lexical Conventions.

Section 4.2: General
Question 2: What truncation rules are applied when a floating value is converted to an integral value?
Answer: Truncation toward zero.
Options: A description of the manner in which floating values are converted. The description should address the rules for truncation of both positive and negative values.
Rationale: The XPG states that such conversions are machine dependent. In particular
38. all previously installed keyboard mapping is automatically disabled until the user leaves the VP/ix Environment. If a non-U.S. keyboard is used, DOS must be informed. With the VP/ix Environment, the system administrator can choose to give each VP/ix user an individual C: drive (this is a virtual disk drive, in reality a UNIX System file, that contains DOS and is used to boot it) or to use a system-wide C: drive. When a non-U.S. keyboard is used, using individual C: drives is preferable, because this drive contains the essential DOS system files CONFIG.SYS and AUTOEXEC.BAT that need to be edited to insert information about the keyboard and language used, as well as which country's conventions should be applied. Refer to the documentation that accompanied your DOS system for details.

4.7 Entering Data and Using INTERACTIVE X11
When INTERACTIVE X11 is used with the system, a special program called a display server is invoked. This program switches the system from a character-based environment to an all-graphical environment. From that point on, all mapping information specified through the ttymap interface is no longer used. The server program is responsible for performing the correct actions each time a key is pressed on the keyboard. By default, it treats any keyboard as a U.S. keyboard.
A utility called xttymap is provided to change the default actions of the server. It can read and interpret the same input file that is used with ttymap
39. as numeric
    isspace    Character codes to be classified as spacing (delimiter) characters
    ispunct    Character codes to be classified as punctuation characters
    iscntrl    Character codes to be classified as control characters
    isblank    Character code for the space character
    isxdigit   Character codes to be classified as hexadecimal digits
    ul         Relationship between uppercase and lowercase characters
Any lines with a number sign (#) in the first column are treated as comments and are ignored. Blank lines are also ignored.
A character can be represented as a hexadecimal or octal constant; for example, the letter a can be represented as 0x61 in hexadecimal or 0141 in octal. Hexadecimal and octal constants may be separated by one or more space or tab characters.
The dash character (-) can be used to indicate a range of consecutive numbers. Zero or more space characters may be used for separating the dash character from the numbers.
The backslash character (\) is used for line continuation. Only a carriage return is permitted after the backslash character.
The relationship between uppercase and lowercase letters (ul) is expressed as ordered pairs of octal or hexadecimal constants: <uppercase_character lowercase_character>. These two constants may be separated by one or more space characters. Zero or more space characters may be used for separating the angle brackets <
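The constant notation described above (0x61 in hexadecimal, 0141 in octal) happens to match the prefixes that the C library's strtol recognises when it is called with base 0. The following sketch shows how a program might read one such constant; the helper name is hypothetical and this is not the parser used by chrtbl.

```c
#include <stdlib.h>

/* Hypothetical helper: convert one character-code constant to its value.
 * With base 0, strtol() treats a leading "0x" as hexadecimal and a
 * leading "0" as octal (plain decimal digits are accepted as well).
 * Returns -1 if the text is not a valid constant in the range 0..255. */
long parse_char_code(const char *text)
{
    char *end;
    long value = strtol(text, &end, 0);

    if (end == text || *end != '\0' || value < 0 || value > 255)
        return -1;
    return value;
}
```

Both parse_char_code("0x61") and parse_char_code("0141") return the code of the letter a.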
40. character followed by two other keystrokes will generate a single character. As an example, COMPOSE followed by the plus and the minus sign could generate the plus/minus sign. Compose sequences can also be used as an alternative for dead keys (for example, COMPOSE followed by an accent and e, instead of a dead accent key followed by e, to get an accented e).

Decimal representation
Rarely used characters can be generated by pressing the compose key followed by three digits.

Toggle key
An optional toggle key can be defined to temporarily disable the current mapping from within an application. This can be useful when, for example, a German programmer wants easy access to the curly braces and the brackets.

Scancode Mapping
The keyboards of the console and some other peripherals, such as SunRiver workstations, behave differently from those of regular terminals. They generate what are called scancodes, and you will also find a number of keys on these keyboards, such as the ALT key, that are not found on regular terminals. Scancodes generated by PC keyboards typically represent the location of the key on the keyboard. The keyboard driver has to properly translate these scancodes. The different national variants of a PC keyboard not only have non-English characters printed on some of the keycaps, but the order of some of the keys is different as well. Without changing the scancode translation, a French user would type A a
41. chrtbl(1M)
    chrclass   name of the data file to be created by chrtbl
    isupper    character codes to be classified as uppercase letters
    islower    character codes to be classified as lowercase letters
    isdigit    character codes to be classified as numeric
    isspace    character codes to be classified as a spacing (delimiter) character
    ispunct    character codes to be classified as a punctuation character
    iscntrl    character codes to be classified as a control character
    isblank    character code for the space character
    isxdigit   character codes to be classified as hexadecimal digits
    ul         relationship between uppercase and lowercase characters
Any lines with the number sign (#) in the first column are treated as comments and are ignored. Blank lines are also ignored.
A character can be represented as a hexadecimal or octal constant; for example, the letter a can be represented as 0x61 in hexadecimal or 0141 in octal. Hexadecimal and octal constants may be separated by one or more space and tab characters.
The dash character (-) may be used to indicate a range of consecutive numbers. Zero or more space characters may be used for separating the dash character from the numbers.
The backslash character (\) is used for line continuation. Only a carriage return is permitted after the backslash character.
The relationship between uppercase and lowercase letters (ul) is expressed as ordered pairs of octal or hexadecimal constants: <upper case
42. describes the layout of a mapfile that is read by the ttymap program. A mapfile is a text file that consists of several sections. A sharp sign (#) can be used to include comments: everything following the # until the end of the line will be ignored by the ttymap program. Inside a line, C-style comments can be used as well. The beginning of each section is indicated by a keyword. Spaces and tabs are silently ignored and can be used at all times to improve readability. All but one section (the one that defines the compose character) can be left out. The order in which the different sections should appear is predefined. Here is the list of keywords in the order they should appear:
    input
    toggle
    dead
    compose
    output
    scancodes
Characters can be described in several different ways. ASCII characters can be described by putting them between single quotes. For example:
    Lu
Between single quotes, control characters can be listed by using a circumflex sign (^) before the character that needs to be quoted. For example:
    y 9 X
When a backslash (\) is used, what follows will be interpreted as a decimal, octal (leading zero) or hexadecimal (leading x or X) representation of the character, although in this case the use of single quotes is not mandatory. For example, '\x88' is the same as 0x88 (zero needed when not quoted) and '\007' is the same
43. > from the numbers.

4.3.1 An Example of a Character Classification Definition
The following is an example of an input file:
    chrclass LC_CTYPE
    isupper  0x41 - 0x5a
    islower  0x61 - 0x7a
    isdigit  0x30 - 0x39
    isspace  0x20 0x9 - 0xd
    ispunct  0x21 - 0x2f 0x3a - 0x40 0x5b - 0x60 0x7b - 0x7e
    iscntrl  0x0 - 0x1f 0x7f
    isblank  0x20
    isxdigit 0x30 - 0x39 0x61 - 0x66 0x41 - 0x46
    ul       <0x41 0x61> <0x42 0x62> <0x43 0x63> <0x44 0x64> \
             <0x45 0x65> <0x46 0x66> <0x47 0x67> <0x48 0x68> \
             <0x49 0x69> <0x4a 0x6a> <0x4b 0x6b> <0x4c 0x6c> \
             <0x4d 0x6d> <0x4e 0x6e> <0x4f 0x6f> <0x50 0x70> \
             <0x51 0x71> <0x52 0x72> <0x53 0x73> <0x54 0x74> \
             <0x55 0x75> <0x56 0x76> <0x57 0x77> <0x58 0x78> \
             <0x59 0x79> <0x5a 0x7a>

4.3.2 How a Program Uses This Information
Programs access this information by using the character classification and conversion library interfaces (refer to ctype(3C)).
As vi does not use the information via the locale, we recommend that the table also be copied to the /lib/chrclass directory and given the same name as the locale.

4.3.3 Use in Regular Expressions and Shell Pattern Matching
The information in the character classification definition can be directly used in regular expressions via the character class syntax inside a bracket expression. The s
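Section 4.3.2 says that programs reach the classification table through the ctype interfaces. A minimal sketch of such a program fragment follows; the helper name is hypothetical, and the cast to unsigned char is the usual precaution for 8-bit character codes.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: count the characters of `text` that the current
 * LC_CTYPE classification table marks as uppercase letters. */
size_t count_uppercase(const char *text)
{
    size_t count = 0;

    for (; *text != '\0'; text++) {
        /* The cast keeps 8-bit codes non-negative, as isupper() requires. */
        if (isupper((unsigned char)*text))
            count++;
    }
    return count;
}
```

The result depends on the table selected by setlocale(LC_CTYPE, ""); with the ASCII-only definition shown above, only the codes 0x41 through 0x5a count as uppercase.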
44. in the optional EXTENDED_CHARMAP section. The format is the same as in the charmap section, except that the encoding consists of two or more concatenated constants, for example:
    EXTENDED_CHARMAP
    <A-acute>  \d039\d065
    END EXTENDED_CHARMAP
8859 is used as a synonym for the ISO/IEC 8859-1 codeset.

FILES
User-defined charmap files must be stored in the /lib/charmap directory.
    /lib/charmap/*           Default directory for charmap files; * is the name of the charmap file
    /lib/charmap/ASCII.cmap  Contains ASCII charmap entries
    /lib/charmap/437.cmap    Contains IBM codepage 437 charmap entries
    /lib/charmap/850.cmap    Contains IBM codepage 850 charmap entries
    /lib/charmap/8859.cmap   Contains ISO/IEC 8859-1 entries

SEE ALSO
colldef(1P), iconv(1P).

langinfo(5P)
NAME
langinfo - language information
DESCRIPTION
The langinfo.h header file defines the symbolic constants to be used in the nl_langinfo function to retrieve langinfo data. The mode of the constants is given in nl_types.h.
The following symbolic constants are recognized:
    D_T_FMT D_FMT T_FMT AM_STR PM_STR
    DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7
    ABDAY_1 ABDAY_2 ABDAY_3 ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7
    MON_1 MON_2 MON_3 MON_4 MON_5 MON_6 MON_7 MON_8
45. in a locale directory.

5.1 When to Use a Collation Sequence
A created and installed collation sequence definition is not activated until the user specifies that it should be used. To do this, set the LC_ALL, LC_COLLATE or LANG environment variable to the directory in which the files are stored. This must be done before a program using the stored definitions is executed. Note that the program must be set up to check and set the international environment via the setlocale function.
User-defined collation is supported through the colldef utility and the library functions strxfrm and strcoll (refer to strxfrm(3P) and strcoll(3P) for more information). These functions are used to compare strings based on the defined collation order and rules. Traditional programs that need to do sorting use strcmp, which does byte-to-byte comparison. In the INTERACTIVE UNIX Operating System, the standard utilities that depend on collation, such as sort and ls, have been modified to use the international environment (refer to string(3P) for information).

5.2 Defining Collation
Collation, according to a dictionary, is the act of putting things in their proper order. Collation rules define how the data are put in the proper order, or sorted. Traditionally, the collating order in the UNIX System has been ASCII order, that is, the order in which the characters appear in the ASCII codeset. This is also the natural collating order for the
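To illustrate the difference between strcmp and strcoll mentioned in section 5.1, the sketch below sorts an array of strings with qsort and a strcoll-based callback. The function names are hypothetical; only strcoll, qsort and setlocale come from the C library.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort() callback: order two strings by the current collation sequence
 * instead of by byte values, as strcmp() would. */
static int collate_compare(const void *a, const void *b)
{
    const char *const *first = a;
    const char *const *second = b;
    return strcoll(*first, *second);
}

/* Hypothetical helper: sort `count` string pointers in collation order. */
void sort_by_locale(const char **words, size_t count)
{
    qsort(words, count, sizeof words[0], collate_compare);
}
```

The program must call setlocale(LC_ALL, "") (or at least setlocale(LC_COLLATE, "")) before sorting; otherwise strcoll uses the default locale and behaves like a byte comparison. For large inputs, transforming each string once with strxfrm and comparing the results with strcmp may be cheaper.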
46. is installed using sysadm installpkg in the same manner as other INTERACTIVE subsets or extensions.
- For information about installing optional subsets, refer to section 6.1 of the "INTERACTIVE UNIX Operating System Installation Instructions" in the INTERACTIVE UNIX Operating System Guide.
- For information about using sysadm, refer to sections 2 and 3 of the "INTERACTIVE UNIX Operating System Maintenance Procedures" in the INTERACTIVE UNIX Operating System Guide.
After you have installed the International Supplement, your INTERACTIVE UNIX System will contain internationalised versions of several UNIX System commands, such as date and who. These are installed in the standard UNIX System directories where they belong, for example /bin and /usr/bin. Copies of the original binaries can be found in a subdirectory of the original directory called sysV, for example /bin/sysV and /usr/bin/sysV. Refer to section 10 of the International Supplement User's Manual for a list of the internationalised commands and functionality.
In addition to the commands specified by XPG3, INTERACTIVE has added the colldef and showcat commands. Refer to colldef(1P) and showcat(1P) for more information. The supplement also contains sample files for locales, message catalogues, and charmap files (the latter are used by iconv(1P) and colldef(1P)). Locales are installed in the directory /lib/locale/ISC. Where appropriate, source files for t
47. level will consider the relative position of non-IGNOREd elements in the string, such that if strings compare as equals, the element with the shortest distance from the starting point of the string is collated first.
The directives forward and backward are mutually exclusive. For example:
    order_start forward backward forward
The absence of operands for this keyword is taken as a directive to perform comparisons on a character basis rather than on a string basis.

5.5.4.1 Collation Order
The order_start keyword is followed by collating element entries. The syntax for the collating element entries is:
    collating_element weight weight
Each collating_element consists of either a character (in any of the forms defined above), a collating_element symbol, a collating_symbol symbol, an ellipsis, or the special symbol UNDEFINED. The order in which collating elements are specified determines the character collation sequence, such that each collating element compares less than the elements following it. The NULL character compares lower than any other character.
A collating_element symbol is used to specify multicharacter collating elements and indicates that the character sequence specified via the collating_element symbol is to be collated as a unit and in the relative order specified by its place.
A collating_symbol symbol is used to define a position in the relative order for use in weights.
48. <GRAVE>
collating_symbol <ACUTE>

5.5.3 substitute Keyword

The substitute keyword is used to define a substring substitution in a string to be collated. The syntax is:

    substitute "regexp" with "replacement"

The first operand is treated as a simple regular expression. The replacement operand consists of zero or more characters and regular expression backreferences (for example, \1 through \9). When strings are collated based on a collation definition containing substitute statements, any substitutions are performed before strings are compared. For instance, if you have the substitute statement

    substitute "Mc" with "Mac"

and you compare the two strings McArthur and MacArthur, the substitution is first applied to both strings. As a result, the first string is replaced by MacArthur and the two strings compare as equals.

Ranges in the regular expression are interpreted according to the current character collation sequence, and character classes are interpreted according to the character classification specified via the LC_CTYPE environment variable at collation time. If more than one substitute statement is present in the collation definition, the substitute statements are applied in the order in which they occur in the source definition.

Both operands must be enclosed within double quotes. A null replacement is indicated by two adjacent double quotes. For example
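The rule that substitutions are performed before strings are compared can be sketched in C as follows. This is a simplified illustration, not the colldef implementation: it assumes a literal search string instead of a regular expression, compares with strcmp rather than a collation table, and the names apply_substitute and compare_after_substitute are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `in` to `out`, replacing every literal occurrence of `from`
 * with `to`. Returns 0 on success, -1 if `out` is too small.
 */
int apply_substitute(const char *in, const char *from, const char *to,
                     char *out, size_t outsz)
{
    size_t flen = strlen(from);
    size_t tlen = strlen(to);
    size_t used = 0;

    assert(flen > 0);
    while (*in != '\0') {
        if (strncmp(in, from, flen) == 0) {
            if (used + tlen >= outsz)
                return -1;
            memcpy(out + used, to, tlen);
            used += tlen;
            in += flen;
        } else {
            if (used + 1 >= outsz)
                return -1;
            out[used++] = *in++;
        }
    }
    if (used >= outsz)
        return -1;
    out[used] = '\0';
    return 0;
}

/* Apply the substitution to both strings, then compare the results. */
int compare_after_substitute(const char *a, const char *b,
                             const char *from, const char *to)
{
    char sa[256];
    char sb[256];
    int ra = apply_substitute(a, from, to, sa, sizeof sa);
    int rb = apply_substitute(b, from, to, sb, sizeof sb);

    assert(ra == 0 && rb == 0);   /* inputs must fit the fixed buffers */
    return strcmp(sa, sb);
}
```

Calling compare_after_substitute("McArthur", "MacArthur", "Mc", "Mac") yields 0, because both strings become MacArthur before the comparison takes place.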
49. not present, source definitions are read from standard input.

-s  When this flag is used, the colldef command will not print warning messages.

The locale argument identifies the target locale. If the argument contains one or more slash characters or consists of a dot, it will be interpreted as an absolute path name for the directory in which the created collation table will be stored. Otherwise, the argument is interpreted as the name of a directory under /lib/locale/ISC. The created collation table is stored in a file named LC_COLLATE within the locale directory.

The character set mapping file specified as the charmap option argument is described under charmap(5P).

The collation source definition file contains statements describing the desired collation behaviour. Each statement consists of a keyword, optionally followed by arguments and by collation order entries. The following keywords are recognised:

LC_COLLATE          This keyword must be the first in the file.
collating_symbol    This keyword names symbolic names used in collation order entries.
collating_element   This keyword defines multi-character collating elements.
substitute          This keyword describes regular-expression-type substitutes.
order_start         This keyword defines the collation evaluation direction and immediately precedes the collation order entries.
order_end           This keyword imme
50. of these environment variable searches yields a locale that is not supported (and non-null), the setlocale function returns a NULL pointer and the program's locale is not changed. If all environment variables name supported locales, setlocale then proceeds as if it had been called for each category, using the appropriate value from the associated environment variable or from the default locale if there is no such value.

RETURN VALUES
A successful call to setlocale returns a string that corresponds to the locale set. The string is such that a subsequent call with that string and its associated category will restore that part of the program's locale. The string returned shall not be modified by the program and may be overwritten by a subsequent call to the setlocale function.

RESTRICTIONS
The LC_ALL environment variable is an extension to the X/Open specification; it is derived from the 1990 C language standard. The LC_MESSAGES category and environment variable is also an extension to the X/Open specification; it is added in anticipation of the POSIX.2 standard. Portable programs should avoid using or depending on these environment variables and on the LC_MESSAGES category.

NOTES
For information on how a locale is defined, see locale(5P).

SEE ALSO
localeconv(3P).

NOTE TO USERS
This entry is reprinted from the INTERACTIVE SDS Guide and Programmer's Reference Manual.
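Because the string returned by setlocale may be overwritten by a later call, a program that wants to restore a locale afterwards should keep its own copy. The following C sketch shows one way to do this using only the standard setlocale interface; the helper names save_locale and restore_locale are hypothetical and are not part of the International Supplement.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a private copy of the current locale name for `category`,
 * or NULL on failure. A NULL second argument to setlocale() only
 * queries the locale; it does not change it.
 */
char *save_locale(int category)
{
    const char *current = setlocale(category, NULL);
    char *copy;

    if (current == NULL)
        return NULL;
    copy = malloc(strlen(current) + 1);
    if (copy != NULL)
        strcpy(copy, current);
    return copy;
}

/*
 * Restore a locale obtained from save_locale() and release the copy.
 * Returns 0 on success, -1 if the locale could not be restored.
 */
int restore_locale(int category, char *saved)
{
    int ok;

    assert(saved != NULL);
    ok = setlocale(category, saved) != NULL;
    free(saved);
    return ok ? 0 : -1;
}
```

A typical use is to call save_locale(LC_ALL), switch temporarily to another locale, and then call restore_locale(LC_ALL, saved) once the locale-dependent work is finished.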
51. prepare and install a properly functioning international environment on an INTERACTIVE UNIX Operating System. It also summarizes the internationalisation features and provides tips for C programmers who want to develop internationalised applications. Developers of such applications should also consult the X/Open Portability Guide. Note that before reading this document, you should have already read the International Supplement User's Manual.

2. SETTING UP THE ENVIRONMENT FOR USERS' TERMINALS

This section describes how a system administrator can configure the terminals on the system to use the appropriate codesets and the keyboards supported by those terminals. It also explains the need for character mapping ability and gives tips for establishing the correct mapping from boot time.

2.1 Motivation

The original UNIX Operating System and most systems derived from it have been based on the ASCII 7-bit coded character set and American English. The ASCII character set consists of 128 different characters, each represented by a single byte; the eighth bit is not used. Beginning with UNIX System V Release 3.1, most applications have been modified to properly support characters represented as a byte with the eighth bit set as well. This means that now 256 characters can be supported at the same time. A consistent coding convention needs to be applied, however. In the IBM PC world a
52. reference

Conformance Reference
Indicator of Compliance: None

Environment Specification

Enter below details of the hardware and software environment in which conformance is claimed, including compilation routines and installation procedures, if any. Sufficient detail must be supplied to enable conformant behaviour to be reproduced.

Any 386/486-compatible system with at least 4 MB of RAM and the following INTERACTIVE UNIX System V/386 Release 3.2 Version 3.0 subsets and extensions installed (approximately 40 MB of disk space is needed):

    Core
    Kernel Configuration
    File Management
    International Supplement
    INTERACTIVE Software Development System

Conformance Expectations

Volume 1 of XPG3 recognises that convergence of implementations towards a common specification for commands and utilities is not yet complete, and therefore does not require a vendor to supply all of the commands and utilities and individual options specified in XPG3. This chapter explicitly identifies those commands and utilities not supplied by the vendor and any supplied which do not conform to the published specification.

Reference: XPG3, Volume 1, Page 1

Section 3.1 Basic Utilities

3.1.1 Supported Commands

Question 1: Which of the basic utilities (non-development utilities) defined in the XPG are not pro
53. registered trademark of Adobe Systems Incorporated.
DEC and VT220 are trademarks of Digital Equipment Corporation.
386 and 486 are trademarks of Intel Corporation.
AT and IBM are registered trademarks of International Business Machines Corporation.
PC XT is a trademark of International Business Machines Corporation.
MS-DOS is a registered trademark of Microsoft Corporation.
SunRiver is a registered trademark of SunRiver Corporation.
X/Open is a trademark of X/Open Company Limited.

International Supplement Guide

CONTENTS

International Supplement Overview and Installation Instructions
International Supplement User's Manual
International Supplement Manual for Advanced Users
X/Open Conformance Statement Questionnaire
International Supplement Reference Manual

International Supplement Overview and Installation Instructions

CONTENTS

1. OVERVIEW
2. INSTALLATION INSTRUCTIONS
3. DOCUMENTATION REFERENCES

1. OVERVIEW

INTERACTIVE's International Supplement extends the INTERACTIVE UNIX System V/386 Release 3.2 Operating System for use in an international environment. It allows software vendors to develop their applications in such a way that the text of one single application can be displayed in a different language depending on the environment in which it is executed; a separate copy of the application for each language is not required. The In
54. sequences
• The decimal representation of the character

4.2.1 Deadkeys

The deadkey was invented by typewriter manufacturers. For example, imagine you need the French character ê. A French typewriter does not have a key for this character, but it has keys for both e and ^. When the ^ key is pressed, a circumflex is printed but the typewriter carriage does not move. When the e key is then pressed, the letter is printed on the same spot as the circumflex and an ê is formed. This technique works very similarly on a terminal. The only difference is that when ^ is pressed, nothing happens until e is pressed, after which the ê character appears on the screen.

A utility developed by INTERACTIVE that can be used to assign deadkeys, ttymap, is supplied with the INTERACTIVE UNIX Operating System. This utility is used to do everything discussed in this section. To define ^ as a deadkey and try the other examples listed below, type the command:

    ttymap /usr/lib/keyboard/usa.map

Now when you press ^, nothing appears on the screen. When an e is typed next, the letter ê appears. To use the ^ character alone, press ^ first and then the spacebar. If a sequence of two characters is typed that does not make sense at all, no character is sent to the application that is currently being used, and the machine beeps to indicate that an erroneous combination was typed.

4.2.2 Composing Chara
55. should precede the specified character. The normal field defines how the scancode is translated when no other key is pressed; the shift field defines the translation for when the SHIFT key is used simultaneously; the alt field specifies what to do when the ALT key is pressed together with this key; and the shiftalt field contains the information on what to generate when both the SHIFT and ALT keys are pressed. All five fields must be filled in. When no translation is requested (that is, the current active translation does not need to be changed), a dash can be used.

The sixth field is optional. This field can contain the special keyword CAPS or NUM, or both, to indicate whether or not the CAPS LOCK key or NUM LOCK key status has any effect.

Here is a sample line that describes the default translation for the Q key:

    0x10 IC q IN Q IN CAPS

If the normal or shift field is filled out for a scancode that represents a function key, a self-explanatory message will be produced and that translation information will be ignored.

A more detailed example of a scancodes section is:

    scancodes
    the w key
    0x11 w IC w IN W IN CAPS
    left square bracket and curly brace key; control shift does not generate anything (no C flag)
    0x1a PIC P PIN RN
    9 on numeric keypad
    0x49 V I 9 PIN FIN NUM
    F13 d a t e 0 SHIFT F1

More complete examples of mapfiles can be found in /usr/lib/keyboard/usa.map and /usr/l
56. space or tab). Operands are characters, strings of characters, or digits. When a keyword is followed by more than one operand, the operands must be separated by semicolons. Blanks are allowed before and/or after a semicolon. Strings must be surrounded by quotes. Individual characters may be surrounded by quotes, but it is not required. Blank lines or lines containing a number sign (#) in the first column are ignored. A line can be continued by typing a backslash (\) as the last character on the line.

The following keywords are recognised:

LC_TIME         The header.
abday           Defines the abbreviated names of the weekdays, starting with Sunday.
day             Defines the names of the weekdays, starting with Sunday.
abmon           Defines the abbreviated names of the months, starting with January.
mon             Defines the names of the months, starting with January.
t_fmt           Defines the format of the time string.
d_fmt           Defines the format of the date string.
d_t_fmt         Defines the format of the combined date and time string.
am_pm           Defines the strings used to specify ante meridiem and post meridiem in a time string according to the 12-hour clock.
t_fmt_ampm      Defines the format of the 12-hour time display.
END LC_TIME     The trailer.

Refer to date(1) for more information about date field descriptors.

3.3.1 abday Keyword

This keyword defines the abbreviated weekday names corresponding to the date field de
57. substitute "Mc" with ""

5.5.4 order_start Keyword

The order_start keyword precedes collation order entries and also defines the number of weights for this collation sequence definition and other collation rules. The syntax of the order_start keyword is:

    order_start sort_rules;sort_rules

The operands to the order_start keyword are optional. If present, the operands define rules to be applied when strings are compared. The number of operands defines how many weights each element is assigned; if no operands are present, one forward operand is assumed. If present, the first operand defines rules to be applied when comparing strings using the first (primary) weight, the second when comparing strings using the second weight, and so on. Operands are separated by semicolons. Each operand consists of one or more collation directives, separated by commas. If the number of operands exceeds the COLL limit, the utility ignores the operands in excess of the limit and issues a warning message.

The following directives are supported:

forward     Specifies that comparison operations for the weight level proceed from the beginning of the string to the end of the string.
backward    Specifies that comparison operations for the weight level proceed from the end of the string to the beginning of the string.
position    Specifies that comparison operations for the weight
58. the characters, such as the one used in previous examples, are accessed using a deadkey; most of the others are printed on a keycap. The keys that are used for symbols such as the square bracket and curly brace on U.S. keyboards have local language accented characters printed on them rather than the American characters (see Figure 1). Although not often used in text, these symbols are certainly important in the context of the UNIX Operating System, especially when the system is used for C programming.

Having sacrificed these symbols to support the local language, there must be an alternative way of obtaining them. The solution provided by most keyboard manufacturers is to print three symbols on the top row keys. In addition to the digits and symbols such as plus and minus, the braces and brackets are printed either in the right bottom corner or on the front of the keycap. To generate these symbols, press the key simultaneously with the right ALT key. When using the INTERACTIVE UNIX Operating System, no distinction is made between the left and the right ALT key, but in certain applications, such as those based on X11, a distinction is made.

In the INTERACTIVE UNIX Operating System, ttymap input files are provided for all major European keyboards. When the system is properly configured by the system administrator, keyboards function correctly without user intervention, even before logging into the system (an INTERACTIVE feature
59. used to reference U.S. English keyboard layouts.

QWERTZ      Name used to reference German keyboard layouts.
RAM         Random Access Memory.
ROM         Read Only Memory.
SVID        System V Interface Definition.
terminal    A self-contained unit with a keyboard and a screen that is connected to a serial port of a computer.
trigraphs   Three-letter sequences used in an ANSI C source file that are interpreted as a single symbol. This is essential to the C language.
XPG3        X/Open Portability Guide, Issue 3.
XSI         X/Open System Interface.

International Supplement Manual for Advanced Users

CONTENTS

1. INTRODUCTION
2. SETTING UP THE ENVIRONMENT FOR USERS' TERMINALS
   2.1 Motivation
   2.2 Mapping Features
   2.3 The ttymap Program
       2.3.1 A Sample mapfile
   2.4 Activating Mapping Prior to Login
       2.4.1 The System Console
       2.4.2 Changing the Default Font for the Console
       2.4.3 Other Terminals
       2.4.4 User-Specific Configuration
       2.4.5 General ttymap Guidelines
3. SPECIFYING DATE AND TIME FORMATS
   3.1 When to Use the Date and Time locale Category
   3.2 Date and Time Formatting
   3.3 Creating a Date and Time Formatting Definition
       3.3.1 abday Keyword
       day Keyword
       abmon Keyword
       mon Keyword
       d_t_fmt Keyword
       d_fmt Keyword
       t_fmt Keyword
       am_pm Keyword
       t_fmt_ampm Keyword
       A Sample File
       How a Program Uses This Information
4. SPECIFYING CHARACTER CLASSIFICATION INFORMATION
60. used to separate groups of digits (thousands separator).
4. The size of such groups.
5. The content and placement of strings used to denote the currency.
6. Positive and negative signs and their placement.

6.7 Creating a Monetary Category Definition

The source language for the monetary category in the INTERACTIVE UNIX Operating System is the language defined by the POSIX.2 group for the LC_MONETARY locale category.

A monetary editing source definition consists of a header, a monetary editing body, and a trailer. The header is the word LC_MONETARY. The trailer is the string END LC_MONETARY. The monetary editing body consists of one or more lines of text. Each line contains a keyword followed by one or more operands. Keywords are separated from the operands by one or more blank characters (space or tab). Operands are characters, strings of characters, or digits. When a keyword is followed by more than one operand, the operands must be separated by semicolons. Blank characters are allowed before and/or after a semicolon. Strings must be surrounded by quotes. Individual characters may be surrounded by quotes, but it is not required. Blank lines or lines containing a number sign (#) in the first column are ignored.

The following keywords are recognised:

int_curr_symbol     Defines the ISO standard four-character (three letters and a space) code for currency, for example USD f
61. value is the separator used to format monetary values.

mon_grouping      The value is a string of semicolon-separated numbers, as described in localeconv(3P).
positive_sign     The string used to indicate a value for a non-negative formatted monetary quantity.
negative_sign     The string used to indicate a negative-valued formatted monetary quantity.
int_frac_digits   The number of fractional digits (those after the decimal delimiter) to be displayed in an internationally formatted monetary quantity.
frac_digits       The number of fractional digits (those after the decimal delimiter) to be displayed in a formatted monetary quantity.
p_cs_precedes     Set to 1 or 0 if the currency symbol respectively precedes or succeeds the value for a non-negative formatted monetary quantity.
p_sep_by_space    Set to 1 or 0 if the currency symbol respectively is or is not separated by a space from the value for a non-negative formatted monetary quantity.
n_cs_precedes     Set to 1 or 0 if the currency symbol respectively precedes or succeeds the value for a negative formatted monetary quantity.
n_sep_by_space    Set to 1 or 0 if the currency symbol respectively is or is not separated by a space from the value for a negative formatted monetary quantity.
p_sign_posn       Set to a value indicating the positioning of the positive
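The way a program might apply p_cs_precedes, p_sep_by_space, and frac_digits can be sketched in C using the standard struct lconv, which is what localeconv returns. This is a minimal sketch under stated assumptions: the function name format_amount is hypothetical, only non-negative amounts are handled, and grouping and sign placement are ignored.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a non-negative amount (whole units plus fractional part)
 * according to a few fields of a struct lconv. Returns the value of
 * snprintf(). Falls back to "." and two digits when the locale does
 * not define the decimal point or the number of fractional digits.
 */
int format_amount(char *buf, size_t size, long whole, int frac,
                  const struct lconv *lc)
{
    const char *sym, *sep, *dp;
    int digits;

    assert(buf != NULL && lc != NULL);
    sym = lc->currency_symbol != NULL ? lc->currency_symbol : "";
    sep = lc->p_sep_by_space == 1 ? " " : "";
    dp = (lc->mon_decimal_point != NULL && *lc->mon_decimal_point != '\0')
             ? lc->mon_decimal_point : ".";
    digits = lc->frac_digits == CHAR_MAX ? 2 : lc->frac_digits;

    if (lc->p_cs_precedes == 1)
        return snprintf(buf, size, "%s%s%ld%s%0*d",
                        sym, sep, whole, dp, digits, frac);
    return snprintf(buf, size, "%ld%s%0*d%s%s",
                    whole, dp, digits, frac, sep, sym);
}
```

In a real program the structure would normally come from localeconv() after a call to setlocale; a hand-filled structure with the currency symbol following the value and separated by a space would produce output such as 12,50 F.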
62. was English. The array pointed to by the return value should not be modified by the program, but may be modified by further calls to nl_langinfo. In addition, calls to the setlocale(3P) function with a category corresponding to the category of item, or to the category LC_ALL, may overwrite the array.

RETURN VALUES
In a locale where langinfo data is not defined, nl_langinfo returns a pointer to the corresponding string in the C locale. In all locales, nl_langinfo returns a pointer to an empty string if item contains an invalid setting.

ERRORS
No errors are defined.

SEE ALSO
setlocale(3P), langinfo(5P), locale(5P).

NOTE TO USERS
This entry is reprinted from the INTERACTIVE SDS Guide and Programmer's Reference Manual.

setlocale(3P)

NAME
setlocale - locale control

SYNOPSIS
    #include <locale.h>
    char *setlocale(int category, const char *locale);

DESCRIPTION
The setlocale function sets, changes, or queries the program's locale according to the values of the category and locale arguments. The possible values for category are:

LC_ALL        Names the entire locale.
LC_COLLATE    Affects the behaviour of the string collation functions.
LC_CTYPE      Affects the behaviour of the character handling functions. The functions isdigit and isxdigit are not affected by the current locale.
LC_MESSAGES   Affects the interpretation of the strings associated with affi
63. yes and negative (no) responses to program queries.

7.1 Reasons for Defining Yes/No Responses

The standard UNIX System utilities that require this kind of interaction, such as rm, normally expect either a y or an n. In countries that do not normally use the English language, this is not the obvious response. In France, for instance, the obvious affirmative response would be o for oui; in Spain it would be s for si.

7.2 Defining Yes/No Responses

These definitions are created by placing a specification in the LC_MESSAGES file in a locale directory.

7.3 When to Use the Yes/No Response locale Category

The created and installed definitions are not activated until the user specifies that they should be used. To do this, the user must set the LC_ALL, LC_MESSAGES, or LANG environment variable to the directory in which the files are stored. This must be done before a program using the stored definitions is executed. Note that the program must be set up to check and set the international environment via the setlocale function. In the INTERACTIVE UNIX System, the standard utilities that depend on a yes/no response, such as ln and rm, have been modified to use the international environment.

Note that while the internationalised yes/no response is required by XPG3 for certain commands, the LC_MESSAGES category is not part of the locale as defined by XPG3.

7.4 Creating a Yes/No Response Category Definition

The source language for the
64. 00 represents the first serial port of the computer.

To test the new configuration, first kill any existing getty processes for the devices with entries that have been changed. Then, as superuser, type:

    telinit q

This has the system reread the /etc/inittab file. This file is recreated each time a new UNIX System kernel is built, using information stored in other files. Therefore, one more step needs to be taken after the terminal setup has been successfully tested. Add the same line with getty m to either /etc/conf/cf.d/init.base (the base inittab file that contains information about the console) or the file in the directory /etc/conf/init.d that corresponds to the device driver of the peripheral to which the terminal is attached (for example, asy for the serial port).

2.4.4 User-Specific Configuration

The configuration guidelines given in the previous section assume that all users of a particular terminal use the system in the same fashion. This may not always be the case. A French user using a U.S. terminal may want to see a circumflex defined as a deadkey; an American user would not. If this is the case, you can add the appropriate loadfont or ttymap commands to the user's $HOME/.profile file (for Bourne Shell users) or to the appropriate user-specific configuration files for other shells. These commands override the system-wide configuration.

2.4.5 General ttymap Guideli
65. 03
    104 h   105 i   106 j   107 k   108 l   109 m   110 n   111 o
    112 p   113 q   114 r   115 s   116 t   117 u   118 v   119 w
    120 x   121 y   122 z   123 {   124 |   125 }   126 ~   127

There are a few interesting points about the ASCII codeset. Uppercase characters are represented using lower numbers than lowercase characters, and the difference between the value of an uppercase character and its corresponding lowercase character is constant (32). This has often been used (and misused) by programmers. The last character (127) is not always printable. This does not cause any problems, as this character is used by the INTERACTIVE UNIX Operating System as the DELETE character to interrupt programs. The ASCII codeset contains all letters of the English alphabet and none of the additional letters used in French, German, and other languages.

5.2 8-bit Characters and Codesets

Inside the computer, 7-bit numbers are actually stored as 8-bit entities. In most computers a byte (8 bits, or a series of 8 possible zeroes and ones) is the smallest possible unit used to store information, which makes it possible to actually use 256 different characters and symbols. Today, this is true if you use the console. If you have a compiler on your system, you can compile and run the following program:

    #define _XOPEN_SOURCE
    #include <stdio.h>

    main(argc, argv)
    int argc;
    char *argv[];
    {
        int c = 32;

        while (c++ < 255) {
            printf("%4d %c", c, c);
            if ((c + 1) % 8 == 0)
                printf("\n");
66. Reference: XPG3, Volume 2, Page 12, The Compilation Environment

2.1.3 Limit Values

Question 3: What are the values associated with the following limits specified in the limits.h header file?

Answer:

    Macro Name     Meaning                                                  Minimum   Maximum
    ARG_MAX        Max length of argument list and environment data            5120      5120
    CHILD_MAX      Max number of processes per user ID                           15        60
    LINK_MAX       Max number of links to a single file                        1000      1000
    MAX_CANON      Max bytes in a terminal canonical input line                 255       255
    MAX_INPUT      Max bytes in a terminal input queue                          255       255
    NAME_MAX       Max characters in a file name                                 14        14
    OPEN_MAX       Max number of files open in a process                         20       100
    PASS_MAX       Max significant characters in a password                       8         8
    PATH_MAX       Max characters in a path name                                255       255
    PIPE_BUF       Max bytes in an atomic write to a pipe                     10240     10240
    NGROUPS_MAX    Max number of supplementary group IDs                         16        16
    TMP_MAX        Max number of unique temporary file names                  17576     17576

Options: Specify a minimum and maximum limit for each limit value. The minimum limit should be the result of evaluating the associated macro in <limits.h>. The maximum limit should be the largest value that is returned from sysconf or pathconf. The maximum values can be specified as indeterminate.
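The distinction between the compile-time value in <limits.h> and the run-time value returned by sysconf or pathconf can be illustrated with the following C sketch. It uses only the standard sysconf and pathconf calls; the wrapper names runtime_open_max, runtime_name_max, and print_runtime_limits are hypothetical, and the values printed depend on the system on which the program runs.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Run-time limit on open files per process (-1 if indeterminate). */
long runtime_open_max(void)
{
    return sysconf(_SC_OPEN_MAX);
}

/* Run-time limit on file name length in the directory `path`. */
long runtime_name_max(const char *path)
{
    assert(path != NULL);
    return pathconf(path, _PC_NAME_MAX);
}

/* Print both limits; a value of -1 means "indeterminate" or an error. */
void print_runtime_limits(const char *path)
{
    printf("OPEN_MAX (sysconf):  %ld\n", runtime_open_max());
    printf("NAME_MAX (pathconf): %ld\n", runtime_name_max(path));
}
```

Calling print_runtime_limits("/") reports the limits that apply to the running system and to the root file system, which may be larger than the minimum values guaranteed by the header file.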
67. 859 codesets, each for a different territory. The most important ones are:

    ISO Codeset    Territory or Languages Intended
    ISO 8859-1     Western Europe
    ISO 8859-2     Eastern Europe (English, Czech, Polish, and so on)
    ISO 8859-5     English and Russian alphabet
    ISO 8859-7     English and Greek alphabet

5.5 7-bit Codesets

Earlier in this document we described terminals that support only 128 different characters and use a key to select a language or country. The 7-bit characters generated by most of these terminals follow an ISO standard convention, ISO 646, which is the ISO code name for the ASCII standard. For use with languages other than English, the local language letters are substituted for certain symbols.

5.6 Choosing and Configuring a Codeset

It is the system administrator's responsibility to deal with codesets. The INTERACTIVE UNIX System utility that configures the system to correctly store characters that are generated by the keyboard is the same utility that is used to configure the keyboard: ttymap. The system administrator has to verify that data storage happens consistently, regardless of the type of terminal used. Otherwise, what was edited as one character on the console yesterday may appear as a different character on a regular terminal today.

The system administrator must choose between one of the IBM codepages and one of the ISO 8859 conventions. The first issue that determines that decision is obvious: which language
68. All programs that are written using functions such as isupper have access to this mechanism.

10. INTERNATIONALISED INTERACTIVE UNIX SYSTEM UTILITIES

To use the internationalisation features described in the previous sections, a number of UNIX System utilities needed to be modified. Their enhanced behaviour complies with the specifications listed in volume 1 of the X/Open Portability Guide, Issue 3. Most of the differences in behaviour are transparent to the user. In most cases, when no local environment (locale) is set up, the behaviour defaults to the standard System V behaviour. The manual entries in the INTERACTIVE UNIX System User's/System Administrator's Reference Manual have not been modified to reflect the internationalised behaviour. Refer to volume 1 of the X/Open Portability Guide for more details.

These utilities are supplied with the International Supplement and are installed in the directories where the original UNIX System V utilities are located. A list of these utilities and the locale categories they understand follows. One category described in the International Supplement Manual for Advanced Users, which deals with regular expressions, is referred to as Internationalised Regular Expressions (Int. RE).

The following utilities are supplied:

    Table of utilities and categories (columns include LC_COLLATE, LC_TIME, LC_NUMERIC, LC_MESSAGES, and Int. RE):
    ar Y
    awk Y Y Y Y
    comm
    cp Y
69. C msgcat name. The function returns a message catalogue descriptor (type nl_catd, defined in the include file nl_types.h). Refer to the hello.c sample file later in this section and to catopen(3P) for more information.

• catgets
  This key function takes four arguments. The first is the message catalogue descriptor returned by a previous catopen. The second is the set number or identifier; the default set identifier, NL_SETD, is defined in nl_types.h. The third is the message number or identifier. The fourth is the default message, used in case no message catalogue is found or the specified message is not in the message catalogue. Refer to catgets(3P) for more information.

• catclose
  This should be used at the end of the program to close the previously opened message catalogue. It takes one argument, which is a message catalogue descriptor returned by a previous catopen. Refer to catclose(3P) for more information.

A message catalogue can then be created containing the text of the local language. This is a text file with a particular format; refer to gencat(4P) for details. The gencat utility (see gencat(1P)) should then be used to convert the message catalogue source into a real binary message catalogue. INTERACTIVE has added a utility, showcat, that can be used to translate the contents of a message catalogue into its message text source (that is, the opposite of the gencat utilit
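The sequence of catopen, catgets, and catclose described above can be sketched in C as follows. This is a minimal illustration and not the hello.c sample file referred to in the text; the function name get_message and the catalogue name passed to it are hypothetical. The message is copied into a caller-supplied buffer because the string returned by catgets should not be relied upon after the catalogue is closed.

```c
#include <assert.h>
#include <nl_types.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up message `msg_num` in the default set (NL_SETD) of the named
 * catalogue and copy it into `buf`. If the catalogue cannot be opened
 * or the message is missing, the fallback text is used instead.
 */
const char *get_message(const char *catalogue, int msg_num,
                        const char *fallback, char *buf, size_t size)
{
    nl_catd cat;
    const char *text = fallback;

    assert(catalogue != NULL && fallback != NULL);
    assert(buf != NULL && size > 0);

    cat = catopen(catalogue, 0);
    if (cat != (nl_catd)-1)
        text = catgets(cat, NL_SETD, msg_num, fallback);

    strncpy(buf, text, size - 1);
    buf[size - 1] = '\0';

    if (cat != (nl_catd)-1)
        catclose(cat);
    return buf;
}
```

When the catalogue is not installed, a call such as get_message("hello", 1, "Hello, world", buf, sizeof buf) simply returns the default English text, so the program remains usable without any message catalogue.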
70. CIRCUMFLEX
    <i-grave>         \d141   LATIN SMALL LETTER I WITH GRAVE
    <A-diaeresis>     \d142   LATIN CAPITAL LETTER A WITH DIAERESIS
    <A-ring>          \d143   LATIN CAPITAL LETTER A WITH RING ABOVE
    <E-acute>         \d144   LATIN CAPITAL LETTER E WITH ACUTE
    <ae>              \d145   LATIN SMALL LETTER AE
    <AE>              \d146   LATIN CAPITAL LETTER AE
    <o-circumflex>    \d147   LATIN SMALL LETTER O WITH CIRCUMFLEX

5.5 Source File Organisation

The source file contains the following keywords, described in detail in the following sections:

LC_COLLATE          The header.
collating_element   A collating_element keyword is used to specify multicharacter collating elements. This keyword is optional.
collating_symbol    A collating_symbol keyword is used to specify collation symbols for use in collation order statements. This keyword is optional.
substitute          Zero or more substitute keywords define mapping between strings. This keyword is optional.
order_start         This keyword is followed by one or more collation order statements, assigning character collation values and collation weights to collating elements.
order_end           This keyword terminates the collation order lines.
END LC_COLLATE      The trailer.

5.5.1 collating_element Keyword

Every character in the character set is also a collating element. If the language or application for which this collation sequence definition is intended also recognises multi
71. DEMAP If no mapping buffer is present for the terminal port corresponding to fd, the ioctl returns -1. Otherwise mapping is reenabled.

    A description of all ioctl commands listed here and of the structure of the mapping buffer can be found in the file /usr/include/sys/emap.h.

    8.2 Giving Programs Access to Locales

    The setlocale function sets, changes, or queries the program's locale according to the values of the category and locale arguments. Therefore, every program that wants to take advantage of the internationalisation features described in this document and in the International Supplement User's Manual should, at a minimum, contain the following statements:

        #include <locale.h>

    and

        setlocale(LC_ALL, "");

    The latter statement causes the program to pick up the current locale value from the environment. If the second argument is not an empty string, the locale is set to that value instead. Refer to setlocale(3P) for more information.

    8.3 Date and Time

    In order to have access to the date and time information, a setlocale statement must be part of the program. If the other locale categories are not to be used, setlocale(LC_TIME, "") is sufficient. In addition, the strftime function should be used instead of the traditional ctime. Refer to ctime(3P) for more information.

    When, in the flow of the program, the value of the local day or month is needed, the nl_langinfo function can be used. It r
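    As a minimal sketch of the setlocale and strftime usage described in sections 8.2 and 8.3, the function below formats a date under a named locale. The function name format_local_date and the format string are illustrative assumptions; a real program would normally pass "" as the locale name so that the environment decides.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: select the LC_TIME category of locale_name, then
 * let strftime produce the weekday name, day, month name, and year.
 * Returns the number of characters written, or 0 on failure. */
size_t format_local_date(const char *locale_name, const struct tm *when,
                         char *out, size_t out_size)
{
    assert(locale_name != NULL && when != NULL && out != NULL);

    /* setlocale returns NULL if the requested locale is not installed. */
    if (setlocale(LC_TIME, locale_name) == NULL)
        return 0;
    return strftime(out, out_size, "%A %d %B %Y", when);
}
```

    Because the day and month names come from the LC_TIME definition rather than from the program, the same call yields local names once a suitable locale has been installed and selected.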
72. The mechanism used to communicate the location of local language information.
    American National Standards Institute.
    American Standard Code for Information Interchange.

    ERTY              Name used to reference French keyboard layouts.
    CAE               Common Applications Environment.
    codepage          A codeset. This term is used in the DOS world, particularly by IBM.
    codeset           A convention describing one-to-one relationships between symbols and numbers. It represents letters as numbers that can be stored in a computer's memory.
    collation         The act of putting things in their proper order; sorting.
    compose sequence  A special key or sequence of keys used to put the keyboard into a special mode where the system expects two more characters to be typed by the user before a character is generated. The default key sequence for the INTERACTIVE UNIX Operating System is F1.
    console           A directly connected keyboard and a monitor attached to a computer's video card.
    deadkey           A procedure for overprinting invented by typewriter manufacturers: when one key is pressed, a character is printed, but the typewriter carriage does not move until the second key is pressed, so that characters consisting of two separate characters can be formed. The INTERACTIVE ttymap utility can be used to assign deadkeys. The only difference is that when the first key is pressed nothing happens until the second key is pressed, after which the entire character
73. International Supplement Guide

    Sun Microsystems, Inc. Business

    First printing (October 1991)

    No part of this manual may be reproduced in any form or by any means without written permission of: INTERACTIVE Systems Corporation, 2401 Colorado Avenue, Santa Monica, California 90404.

    Copyright INTERACTIVE Systems Corporation 1985, 1991
    Copyright AT&T Corporation 1987, 1988
    Copyright X/Open Company Limited 1989

    RESTRICTED RIGHTS

    For non-U.S. Government use: These programs are supplied under a license. They may be used, disclosed, and/or copied only as permitted under such license agreement. Any copy must contain the above copyright notice and this restricted rights notice. Use, copying, and/or disclosure of the programs is strictly prohibited unless otherwise provided in the license agreement.

    For U.S. Government use: Use, duplication, or disclosure by the Government is subject to restrictions as set forth in FAR Section 52.227-14 (Alternate III) or subparagraph (c)(1)(ii) of the clause at DFARS 252.227-7013, Rights in Technical Data and Computer Software.

    All rights reserved. Printed in the U.S.A.

    The following trademarks shown as registered are registered in the United States and other countries: TEN/PLUS is a registered trademark of INTERACTIVE Systems Corporation. VP/ix is a trademark of INTERACTIVE Systems Corporation. UNIX is a registered trademark of UNIX System Laboratories, Inc. Adobe is a
74. locale(5P)

    FILES

    /lib/locale/ISC/locale               Default directory for locale directory structures; locale is the name of the locale.
    /lib/locale/ISC/locale/LC_COLLATE    Contains LC_COLLATE information.
    /lib/locale/ISC/locale/LC_CTYPE      Contains LC_CTYPE information.
    /lib/locale/ISC/locale/LC_MESSAGES   Contains LC_MESSAGES information.
    /lib/locale/ISC/locale/LC_MONETARY   Contains LC_MONETARY information.
    /lib/locale/ISC/locale/LC_NUMERIC    Contains LC_NUMERIC information.
    /lib/locale/ISC/locale/LC_TIME       Contains LC_TIME information.

    SEE ALSO

    chrtbl(1M), colldef(1P), localeconv(3P), setlocale(3P), ctime(3P), environ(5P) in the INTERACTIVE SDS Guide and Programmer's Reference Manual.

    NOTE TO USERS

    This entry is reprinted from the INTERACTIVE SDS Guide and Programmer's Reference Manual.
75. iconv(1P)

    NAME

    iconv - codeset conversion

    SYNOPSIS

    iconv [-S default char specification] -f fromcode -t tocode [file]

    DESCRIPTION

    The iconv utility converts the encoding of characters in file from one codeset to another and writes the results to standard output. The input and output codesets are identified by fromcode and tocode, respectively. If no file argument is specified on the command line, iconv reads the standard input.

    Character encodings in either codeset may include single-byte values (e.g., for ISO standard ISO 8859-1:1987 characters) or multi-byte values (e.g., for certain characters in ISO standard ISO 6937:1983). A character in the input stream that does not have a corresponding conversion in the "to" codeset defaults to the underscore character in the output stream.

    The iconv utility contains six built-in conversion tables. When the -f and -t specifications are both taken from the following list, the built-in conversion tables are used:

    437     IBM codepage 437
    850     IBM codepage 850
    8859    ISO/IEC 8859-1 codeset

    Otherwise, fromcode and tocode are path names for the charmap files. If a path name does not contain a slash, the program assumes that the file is located in the directory /lib/charmap.

    The -S command option allows the default character to be dynamically changed. The format of default char specification is either
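    The default-character rule described above can be illustrated with a small table-driven conversion for single-byte codesets. This is only a sketch of the idea, not the iconv implementation: the function convert_bytes, the NO_MAPPING marker, and the 256-entry table layout are all assumptions introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker for an input byte with no equivalent in the
 * target codeset. */
#define NO_MAPPING (-1)

/* Convert len bytes from in to out using a 256-entry table. Bytes that
 * have no mapping are replaced by default_char (iconv uses the
 * underscore unless another default is requested). */
void convert_bytes(const int table[256], unsigned char default_char,
                   const unsigned char *in, unsigned char *out, size_t len)
{
    size_t i;

    assert(table != NULL && in != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        int mapped = table[in[i]];
        out[i] = (mapped == NO_MAPPING) ? default_char
                                        : (unsigned char)mapped;
    }
}
```

    For example, a table from codepage 437 to ISO 8859-1 would map byte 0x82 (e acute) to 0xE9, while a graphics character with no ISO 8859-1 equivalent would be marked NO_MAPPING and appear as an underscore in the output.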
76. "Oh! You are going to have a French or German version of your product as well." But I18N does not refer to the translation of software, but rather to its usability and translatability. An internationalised application or computer system is one that can be adapted to different environments without needing modification. The term localisation, and its acronym L10N, is used to describe the adaptation of computer programs to a single language and/or country, which, if mismanaged, can be as costly as making a separate version for each language.

    3. THE X/OPEN PORTABILITY GUIDE

    The term X/Open is often associated with standards. X/Open is a trade name as well as a trademark of X/Open Company Limited. This organisation started as a consortium of European computer manufacturers (Bull, ICL, Siemens, Olivetti, Nixdorf, and Philips) whose principal aim is to increase the volume of applications available on their computer systems. In parallel, they have attempted to maximize the return on investments in software development made by users and independent software vendors (ISVs). Today almost all major computer manufacturers are members of the X/Open group.

    3.1 Computer Applications and Portability

    In the sixties, most computer applications were developed on and for a single proprietary computer system. In order to make the same application run on a different computer system, it had to be completely rewri
77. P

    Example

        LC_TIME
        abday       Sun Mon Tue Wed Thu Fri Sat
        day         Sunday Monday Tuesday Wednesday Thursday Friday Saturday
        abmon       Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
        mon         January February March April May June July August September October November December
        t_fmt       5
        d_fmt       %d %m %y
        d_t_fmt     a %b HH %M %S %Y
        am_pm       AM PM
        t_fmt_ampm  %I %M %S %p
        END LC_TIME

    Locale Naming Conventions and Usage

    X/Open recommends that locale names follow a certain convention. The recommended format is:

        language[_territory[.codeset]][@modifier]

    where:

    language    Indicates the language area (e.g., fr for French).
    territory   Indicates the geographical area (e.g., CH for Switzerland), which controls, for example, monetary editing rules.
    codeset     Indicates the code set used (e.g., 8859).
    modifier    Can be used to distinguish between otherwise identical names, for instance between two different collation sequences.

    Example

        LANG=fr_FR.8859
        LC_COLLATE=$HOME/mylocale

    In the above declarations, the default locale is French/France using the 8859 codeset (8859 is used as a synonym for the ISO/IEC 8859-1 codeset, also known as Latin-1). This is the locale chosen for all categories except LC_COLLATE, for which a private locale in the directory mylocale is chosen.
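    The naming convention can be made concrete with a small parser that splits a locale name into its four parts. This is a sketch under stated assumptions: the separators (underscore before the territory, period before the codeset, at sign before the modifier) follow the X/Open convention, while the structure locale_name, the 15-character field limit, and the function names are invented for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the parts of a locale name; parts that
 * are absent are left as empty strings. */
struct locale_name {
    char language[16];
    char territory[16];
    char codeset[16];
    char modifier[16];
};

/* Copy len characters of src into a 16-byte field, truncating to 15. */
static void copy_field(char dst[16], const char *src, size_t len)
{
    if (len > 15)
        len = 15;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split a name of the form language[_territory[.codeset]][@modifier]. */
void parse_locale_name(const char *name, struct locale_name *out)
{
    const char *p = name;
    size_t len;

    assert(name != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    len = strcspn(p, "_.@");
    copy_field(out->language, p, len);
    p += len;
    if (*p == '_') {
        p++;
        len = strcspn(p, ".@");
        copy_field(out->territory, p, len);
        p += len;
    }
    if (*p == '.') {
        p++;
        len = strcspn(p, "@");
        copy_field(out->codeset, p, len);
        p += len;
    }
    if (*p == '@')
        copy_field(out->modifier, p + 1, strlen(p + 1));
}
```

    Applied to the value of LANG in the example above, the parser yields the language fr, the territory FR, the codeset 8859, and an empty modifier.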
78. Release

    X/Open Conformance Statement Questionnaire (XCS QUE 3.2)

    Version 3.0 subsets and extensions installed (approximately 40 MB of disk space is needed): Core, Kernel Configuration, File Management, International Supplement, INTERACTIVE Software Development System.

    Temporary Waivers

    List below references to any temporary waivers granted by X/Open in respect of minor errors in the product referenced above. This should include the X/Open reference and the waiver expiry date. The waivers as granted shall be made available with this document on request.

    PG3 239, expiration date April 2, 1992

    Section 2.1 General Attributes

    2.1.1 POSIX.1 Supported Features

    Question 1: Which of the following options specified in the <unistd.h> header file are available on the system?

    Answer:

    Macro Name                 Meaning
    _POSIX_CHOWN_RESTRICTED    The use of chown is restricted.
    _POSIX_JOB_CONTROL         Job Control option.
    _POSIX_NO_TRUNC            Long path name components generate an error.
    _POSIX_SAVED_IDS           Effective user and group IDs are saved.
    _POSIX_VDISABLE            Terminal special characters can be disabled.

    Provided: Yes Yes Yes Yes

    Options: When the option is variable, a description is required for the cases over which the variations occur.
79. UNIX Operating System in textmode only accept constant-width, constant-height fonts of certain sizes. The utility also requires that the fontfile contain a description of all 256 characters of the codeset used. Certain attributes are not used by loadfont but are maintained for compatibility purposes. As a consequence, fontfiles used with loadfont can also be used for other purposes, such as with the INTERACTIVE X11 Windowing System, but not always the other way around.

    Format

    A loadfont input file is a plain ASCII file containing only printable characters (octal 40 through 176) and a carriage return at the end of each line. The information about a particular font should be contained in a single file. The file begins with information on the font in general, followed by the information and bitmaps for the individual characters. The file should contain bitmaps for all 256 characters, and each character should be of the same size.

    A font bitmap description file has the following general form, where each item is contained on a separate line of text in the file. Items on a line are separated by spaces.

    • The word STARTFONT followed by the version number 2.1.
    • One or more lines beginning with the word COMMENT. These lines can be used to add comments to the file and will be ignored by the loadfont program.
    • The word FONT followed by the full name of the font. The name continues all the way to the end of the line.
80. XOPEN SOURCE

    8.1 Character Mapping

    We do not recommend trying to change the active character mapping from an application. However, some programs (the VP/ix Environment, or vpix, for example, which uses MS-DOS style DOS mapping) might want to disable the mapping and set it back before exiting. ioctl commands are available to do this. The following syntax is used:

        ioctl(fd, COMMAND, buffer)

    fd is the file descriptor for the tty port for which the COMMAND is intended. buffer is a pointer of type unsigned char pointing to a buffer of size 1K. The following ioctl commands can be used:

    • LDSMAP
      The buffer is checked for correctness. If some pointers have the wrong value or the size of the buffer exceeds 1K, the ioctl call fails and returns -1. Otherwise the buffer is copied into kernel space and mapping is activated.

    • LDGMAP
      If no mapping buffer is present for the terminal port corresponding to fd, the ioctl returns -1. Otherwise the content of the mapping buffer is copied from kernel space into buffer, and the ioctl returns 0.

    • LDNMAP
      If no mapping buffer is present for the terminal port corresponding to fd, the ioctl returns -1. Otherwise the content of the mapping buffer is freed and mapping is disabled.

    • LDDMAP
      If no mapping buffer is present for the terminal port corresponding to fd, the ioctl returns -1. Otherwise mapping is temporarily disabled.

    • L
81. YING DATE AND TIME FORMATS

    Date and time formatting consists of rules that define how date and time strings appear. These rules are created by placing specifications in the LC_TIME file in a locale directory.

    The default conventions for the date and time format, as well as the names for the days of the week and the months, follow the U.S. conventions and are rarely applicable in other countries. By defining and using the date and time locale category, you can ensure that the dates and times displayed by the system follow your conventions and use the local names of days and months.

    3.1 When to Use the Date and Time locale Category

    A created and installed definition is not activated until the user specifies that it should be used. To do this, set the LC_ALL, LC_TIME, or LANG environment variable to the directory in which the files are stored. This must be done before a program using the stored definitions is executed. Note that the program must be set up to check and set the international environment via the setlocale function. In the INTERACTIVE UNIX Operating System, the standard utilities that display the date and time, such as date and ls, have been modified to use the international environment.

    3.2 Date and Time Formatting

    Date and time formatting controls the appearance of date and time strings created by the system. The following aspects of formatting are controlled via the LC_TIME locale category:

    • Format of the time display
82. _character lower case_character>

    These two constants may be separated by one or more space characters. Zero or more space characters may be used for separating the angle brackets (< >) from the numbers.

    EXAMPLE

    The following is an example of an input file used to create the ASCII code set definition table on a file named ascii.

        chrclass  ascii
        isupper   0x41 - 0x5a
        islower   0x61 - 0x7a
        isdigit   0x30 - 0x39
        isspace   0x20 0x9 - 0xd
        ispunct   0x21 - 0x2f 0x3a - 0x40 \
                  0x5b - 0x60 0x7b - 0x7e
        iscntrl   0x0 - 0x1f 0x7f
        isblank   0x20
        isxdigit  0x30 - 0x39 0x61 - 0x66 \
                  0x41 - 0x46
        ul        <0x41 0x61> <0x42 0x62> <0x43 0x63> \
                  <0x44 0x64> <0x45 0x65> <0x46 0x66> \
                  <0x47 0x67> <0x48 0x68> <0x49 0x69> \
                  <0x4a 0x6a> <0x4b 0x6b> <0x4c 0x6c> \
                  <0x4d 0x6d> <0x4e 0x6e> <0x4f 0x6f> \
                  <0x50 0x70> <0x51 0x71> <0x52 0x72> \
                  <0x53 0x73> <0x54 0x74> <0x55 0x75> \
                  <0x56 0x76> <0x57 0x77> <0x58 0x78> \
                  <0x59 0x79> <0x5a 0x7a>

    FILES

    /lib/chrclass           data file containing character classification and conversion tables created by chrtbl
    /usr/include/ctype.h    header file containing information used by character classification and conversion routines

    SEE ALSO

    ctype(3C) 5 in the INTERACTIVE SDS G
83. acters defined in the ASCII charmap can be used in the name.

    <escape_char>    The escape character is used to indicate that the characters following will be interpreted in a special way, as defined later. The default is the backslash (\) character.
    <comment_char>   The comment character is used to indicate that the characters following on the line constitute a comment and will be ignored. The default is the # character.
    <mb_cur_max>     The maximum number of bytes in a character in the regular charmap. The default value, which is the only value permitted in the INTERACTIVE UNIX System, is 1.
    <mb_cur_min>     The minimum number of bytes in a character in the regular charmap. The value cannot exceed the value of mb_cur_max.

    CHARMAP

    The charmap starts with an identifier line containing the string CHARMAP starting in column 1 and ends with a trailer line containing the string END CHARMAP starting in column 1. Empty lines and lines containing a # in the first column are ignored. Each noncomment line of the character set mapping definition (i.e., between the CHARMAP and END CHARMAP lines of the file) is in the form:

        <symbolic name> encoding

    A symbolic name is one or more characters from the set defined in the ASCII charmap, enclosed between angle brackets. A character following an escape character is interpreted as itself; for example, the sequence <\>> represents the symbolic name >
84. aditional UNIX System environment.

    Programs can modify this environment by using the setlocale(3P) function. If so directed by the program, the values of the above environment variables will be used to set the environment.

    The value assigned to the environment variable LC_ALL, if set, will be used for all locale categories. LC_ALL is primarily intended for use when a user wishes to make sure that a particular program is executed with one locale only (i.e., no mixed locales).

    The value assigned to the environment variable LANG will be used as the value for any of the above variables for which no valid value is assigned. If LANG is set to a valid value and none of the above variables are set, then the entire environment will be set to the value indicated by LANG.

    The information that defines a specific locale must be stored in data files on the system. The information for each category is stored in a file with a name corresponding to the environment variable name. The default location is within a directory under /lib/locale/ISC. The name of the directory is the name of the locale.

    [Figure: directory tree under /lib/locale/ISC with one directory per locale (locale 1, locale 2, locale 3), each containing the files LC_CTYPE, LC_COLLATE, LC_TIME, LC_NUMERIC, LC_MONETARY, and LC_MESSAGES]

    Creating a Locale

    The following steps are used to create the locale information. Locales installed under /lib/locale/ISC should be viewed as public
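    The precedence among LC_ALL, the individual category variables, and LANG can be sketched as follows. The function names are hypothetical, and two simplifications are assumed: a value counts as valid when it is set and non-empty, and "C" stands in for the traditional environment when nothing is set.

```c
#include <assert.h>
#include <stdlib.h>

/* A variable counts as set here when it exists and is not empty. */
static int is_set(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* Pick the locale for one category: LC_ALL overrides the category
 * variable, which overrides LANG. "C" is an assumed final default. */
const char *choose_locale(const char *lc_all, const char *lc_category,
                          const char *lang)
{
    if (is_set(lc_all))
        return lc_all;
    if (is_set(lc_category))
        return lc_category;
    if (is_set(lang))
        return lang;
    return "C";
}

/* Convenience wrapper that reads the real environment, for example
 * effective_locale("LC_TIME"). */
const char *effective_locale(const char *category_variable)
{
    assert(category_variable != NULL);
    return choose_locale(getenv("LC_ALL"), getenv(category_variable),
                         getenv("LANG"));
}
```

    A program does not normally perform this resolution itself, since setlocale does so when it is given an empty string; the sketch only makes the ordering explicit.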
85. ain data from the archive in the case that the file name on the archive is invalid for the system on which the file hierarchy is being created?

    Answer:

    Format          File
    Extended tar    The archive reading utility relies on standard file and directory creating system interfaces to create files and directories. On extraction from the archive, the only case where a filename would be changed is if a pathname component exceeds the system filename length limit of NAME_MAX (14) characters, in which case it would be truncated to NAME_MAX characters.
    cpio            The archive reading utility handles invalid file and directory names in the same manner as extended tar.

    Options:
    1. Definition of the file name used.
    2. None, if the file is not stored on the archive.
    3. Refer to POSIX.1 Conformance Document, Sections 10.1.1 and 10.1.2.2.

    Rationale: Because an archive can contain non-portable file names, it is necessary for an archive reading utility to be able to generate a file and store the data associated with a non-portable file name when this is encountered on the archive. There may be a need to generate a number of such file names in the same directory, and the specification should detail the algorithm used to generate these file names.

    Reference: XPG3, Volume 3, Page 151, Utilities.

    MULTI-VOLUME ARCHIVES

    Question 5: How does the archive reading utility dete
86. alogue. The n denotes the set number [1, {NL_SETMAX}]. Any string following the set number is treated as a comment.

    $ comment
        A line beginning with $ followed by an ASCII space or tab character is treated as a comment.

    m message text
        The m denotes the message identifier, which is defined as a number in the range [1, {NL_MSGMAX}]. The message text is stored in the message catalogue with the set identifier specified by the last $set directive and with message identifier m. If the message text is empty and an ASCII space or tab field separator is present, an empty string is stored in the message catalogue. If a message source line has a message number but neither a field separator nor message text, the existing message with that number (if any) is deleted from the catalogue. Message identifiers must be in ascending order within a single set but need not be contiguous. The length of message text must be in the range [0, {NL_TEXTMAX}].

    $quote c
        This specifies an optional quote character c, which can be used to surround message text so that trailing spaces or null (empty) messages are visible in a message source line. By default, or if an empty $quote directive is supplied, no quoting of message text will be recognised.

    Empty lines in a message text file are ignored. The effects of lines starting with any character other than those defined above are imple
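    The three cases for a numbered line (text stored, empty string stored, message deleted) are easy to confuse, so the sketch below classifies one source line at a time. It is not the gencat parser: the enumeration, the function name classify_line, and the decision to report every line starting with $ as one kind are assumptions, and range checks against NL_MSGMAX and quote handling are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical line kinds for a message source file. */
enum line_kind {
    LINE_EMPTY,      /* ignored */
    LINE_DIRECTIVE,  /* starts with $: a directive or a comment */
    LINE_STORE,      /* "m text" or "m " : store text (possibly empty) */
    LINE_DELETE,     /* "m" alone: delete message m */
    LINE_OTHER       /* anything else */
};

/* Classify one line. For LINE_STORE and LINE_DELETE, *msg_id receives
 * the message identifier; for LINE_STORE, *text points at the text
 * that follows the single separator character. */
enum line_kind classify_line(const char *line, long *msg_id,
                             const char **text)
{
    char *end;

    assert(line != NULL && msg_id != NULL && text != NULL);

    if (line[0] == '\0' || line[0] == '\n')
        return LINE_EMPTY;
    if (line[0] == '$')
        return LINE_DIRECTIVE;
    if (!isdigit((unsigned char)line[0]))
        return LINE_OTHER;

    *msg_id = strtol(line, &end, 10);
    if (*end == ' ' || *end == '\t') {
        *text = end + 1;          /* may be the empty string */
        return LINE_STORE;
    }
    if (*end == '\0' || *end == '\n') {
        *text = end;
        return LINE_DELETE;
    }
    return LINE_OTHER;
}
```

    The distinction that matters is the separator: a line consisting of a number followed by a space stores an empty message, whereas the same number with nothing after it removes the message.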
87. an the previous question, requiring the semantic differences associated with the commands to be specified. Again, the question relates to the basic utilities rather than the development utilities. The question only relates to the semantics of the options specified within the XPG; implementation-specific extensions should not be documented.

    Section 3.2 Development Utilities

    3.2.1 Supported Commands

    Question 3: Which of the development utilities defined in the XPG are not provided with the implementation?

    Answer: All are provided.

    Options:
    1. All provided.
    2. None are provided.
    3. A list of utilities that are not provided.

    Rationale: The XPG Volume states that "The development utilities might not be present in all X/Open-compliant systems; in designated DEVELOPMENT systems, all of the development utilities must be present and must conform to the published definition."

    Reference: XPG3, Volume 1, Page 2, Status of Interfaces.

    3.2.2 Command Behaviour

    Question 4: In what ways do the development utilities provided by the implementation behave differently from the specifications contained in the XPG?

    Answer:

    Command Option Description: compiles and links for the POSIX and XPG3 environments.
    mailx: does not support internationalised behavior.

    Options:
    1. The develo
88. and the ALT key are pressed simultaneously.
    4. The key, the SHIFT key, and the ALT key are pressed simultaneously.

    For each of these cases, the scancode can be translated into one of the following:

    • a single byte
    • a single byte preceded by ESC N
    • a single byte preceded by ESC O
    • a single byte preceded by ESC

    Internally, special bits are set to indicate that an escape sequence needs to be generated. Other bits are used to indicate whether the translated code should be influenced by some special keys:

    NUM LOCK    If the NUM LOCK bit is set, the regular and SHIFT values are swapped, as are the ALT and SHIFT ALT values, whenever the NUM LOCK LED is on. By default, only the keys on the numeric keypad have this bit set. That is why these keys generate 7, 8, 9, etc. when the NUM LOCK LED is on, which is the same value that would be produced if SHIFT were used with these keys.
    CAPS LOCK   This has the same effect as the NUM LOCK key. By default, this bit is set for all letters and not set for punctuation signs.
    CTRL        When a key is translated into a single byte (no escape sequence) and this bit is set, the corresponding control character will be generated when the CTRL key is pressed simultaneously. This is equally valid for the SHIFT, ALT, and SHIFT ALT combinations. When this bit is not used, the CTRL key combination will not generate anything.

    mapfiles

    This section
89. appears on the screen.

    escape sequence       Sequences of characters such as escape, the code generated by the escape key.
    IBM                   International Business Machines.
    IEEE                  Institute of Electrical and Electronics Engineers, Inc.
    internationalisation  Making a computer, a computer system, or a computer program function appropriately in a non-U.S. environment.
    ISO                   The international standards organisation. Note that ISO is not an acronym.
    ISV                   Independent Software Vendor.
    I18N                  Internationalisation.
    L10N                  Localisation.
    locale                An abbreviation for the X/Open concept "local environment": that subset of the user's environment that depends on language and cultural conventions. It consists of the following categories: Date and Time Format; Character Classification; Collation; Numeric and Monetary Formatting; Yes/No Responses and Message Catalogues.
    localisation          The adaptation of computer programs to a single language and/or country.
    output mapping        Modification of the code sent by the system or the application to the screen before a character is displayed.
    POSIX                 Portable Operating System Interface for Computer Environments.
    POSIX.1               International Standard ISO/IEC 9945-1, defining system interfaces.
    POSIX.2               Draft standard for shell and utilities.

    QWERTY QWERTZ RAM ROM SVID terminal trigraph XPG3 XSI Name
90. are contained in a byte-sized array encoded such that a table lookup can be used to determine the character classification of a character or to convert a character (see ctype(3C)). The size of the array is 257*2 bytes: 257 bytes are required for the 8-bit code set character classification table and 257 bytes for the uppercase-to-lowercase and lowercase-to-uppercase conversion table.

    chrtbl reads the user-defined character classification and conversion information from file and creates two output files in the current directory. One output file, ctype.c (a C language source file), contains the 257*2-byte array generated from processing the information from file. You should review the content of ctype.c to verify that the array is set up as you had planned. (In addition, an application program could use ctype.c.)

    The first 257 bytes of the array in ctype.c are used for character classification. The characters used for initialising these bytes of the array represent character classifications that are defined in /usr/include/ctype.h; for example, _L means a character is lower case and _S|_B means the character is both a spacing character and a blank. The last 257 bytes of the array are used for character conversion. These bytes of the array are initialised so that characters for which you do not provide conversion information will be converted to themselves. When you do provide conversion information, the first value of the pair is stored where the s
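    The layout of the 257*2-byte array can be sketched in C as follows. Everything named here is an assumption for illustration: the class bits are not the values from /usr/include/ctype.h, and the reading that the extra 257th slot in each half serves the value -1 (EOF) is an interpretation, not something stated above. The sketch fills the table for ASCII letters only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class bits; the real ones live in <ctype.h>. */
enum { CLASS_UPPER = 0x01, CLASS_LOWER = 0x02 };

#define TABLE_HALF 257   /* 256 byte values plus one slot for -1 */

/* First half: classification bits. Second half: case conversion. */
static unsigned char table[2 * TABLE_HALF];

/* Fill both halves for the ASCII letters; every other character has
 * no class and converts to itself. */
void build_ascii_table(void)
{
    int c;

    table[0] = 0;                /* slot for -1 in the class half */
    table[TABLE_HALF] = 0;       /* slot for -1 in the conversion half */
    for (c = 0; c < 256; c++) {
        table[1 + c] = 0;
        table[TABLE_HALF + 1 + c] = (unsigned char)c;
    }
    for (c = 'A'; c <= 'Z'; c++) {
        int lower = c - 'A' + 'a';
        table[1 + c] = CLASS_UPPER;
        table[1 + lower] = CLASS_LOWER;
        /* Each member of an upper/lower pair converts to the other. */
        table[TABLE_HALF + 1 + c] = (unsigned char)lower;
        table[TABLE_HALF + 1 + lower] = (unsigned char)c;
    }
}

/* Classification is a single indexed load and a mask; c is -1..255. */
int has_class(int c, int class_bits)
{
    assert(c >= -1 && c <= 255);
    return (table[1 + c] & class_bits) != 0;
}

/* Conversion is a lookup in the second half of the same array. */
int convert_case(int c)
{
    assert(c >= 0 && c <= 255);
    return table[TABLE_HALF + 1 + c];
}
```

    Replacing the contents of such a table, which is what installing the output of chrtbl amounts to, changes the behaviour of the classification routines without recompiling the programs that call them.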
91. are sorted.

    3. Multiple weights and equivalence classes
       For many languages the basic ordering is sufficient, but others require more complex rules. For example, in German the ö and the o collate as the same character, but if two words are equal except for the o and the ö, then the word with o comes first. In French, all accented letters collate equally with the base character; if the words are equal, there is a defined secondary ordering among these characters. All characters (or collating elements) that initially collate equally are said to belong to an equivalence class. Such characters typically have more than one weight. The first (primary) weight is that of the equivalence class; the second weight is determined by their relative order. The INTERACTIVE UNIX System supports up to WEIGHTS_MAX (defined as 4 in /usr/include/sys/limits.h) different weights for each character or collating element.

    4. One-to-many mapping
       A single character is mapped into a string of collating elements. An example of this is the German ß, which collates as ss.

    5. Many-to-many substitution
       A string is substituted for another string of one or more characters. The string that is substituted can be an empty string. In other words, the character or characters are ignored for collation purposes.

    6. Ordering by weights
       To determine their relative order, two strings are first com
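    The German example in item 3 can be sketched with two weights per character. This is an illustration only, not the collation code of the system: the weight table is reduced to a single special case, the byte 0xF6 is assumed to be the ISO 8859-1 encoding of o with diaeresis, and the function names are invented.

```c
#include <assert.h>
#include <stddef.h>

struct weights {
    int primary;     /* weight of the equivalence class */
    int secondary;   /* relative order inside the class */
};

/* Hypothetical weight table: o with diaeresis shares the primary
 * weight of o but has a higher secondary weight; every other
 * character is its own class. */
static struct weights weight_of(unsigned char c)
{
    struct weights w;

    if (c == 0xF6) {
        w.primary = 'o';
        w.secondary = 1;
    } else {
        w.primary = c;
        w.secondary = 0;
    }
    return w;
}

/* Compare by primary weights over the whole string; use the first
 * secondary difference only if all primary weights are equal.
 * Returns a negative, zero, or positive value like strcmp. */
int compare_by_weights(const char *left, const char *right)
{
    const unsigned char *a = (const unsigned char *)left;
    const unsigned char *b = (const unsigned char *)right;
    int tie_break = 0;
    size_t i;

    assert(left != NULL && right != NULL);
    for (i = 0; a[i] != '\0' && b[i] != '\0'; i++) {
        struct weights wa = weight_of(a[i]);
        struct weights wb = weight_of(b[i]);

        if (wa.primary != wb.primary)
            return wa.primary < wb.primary ? -1 : 1;
        if (tie_break == 0 && wa.secondary != wb.secondary)
            tie_break = wa.secondary < wb.secondary ? -1 : 1;
    }
    if (a[i] != b[i])            /* one string is a prefix of the other */
        return a[i] == '\0' ? -1 : 1;
    return tie_break;
}
```

    With this table, two words that differ only in o versus o with diaeresis are ordered by the secondary weight, so the word with o comes first, while any difference in primary weights elsewhere in the words decides the order before the secondary weights are consulted.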
92. as 007. When strings are needed, a list of character representations should be used. (Quoted strings will be supported in the future.)

    The following paragraphs describe what goes in each section.

    Input section

    The input section describes which input characters should be mapped into a single byte. A very small sample input section could be:

        input A A into on input P Ox9c map sharp sign into pound sign

    Toggle section

    The toggle section is a one-line section that defines which key is to toggle between mapping and no mapping. For example:

        toggle 9459 y ctrl y is the toggle key

    Deadkey section

    The deadkey section defines which keys should be treated as deadkeys. A dead keyword followed by the specification of the character appears in this section for each deadkey. The subsequent lines describe what key should be generated for each key following the deadkey. A deadkey followed by a key not described in this part of the mapfile will not generate any key, and a beep tone will be produced on the terminal. For example:

        dead circumflex is a deadkey Ae circumflex followed by space generates circumflex e 0x88 circumflex followed by e generates e circumflex
        dead double quote used as a deadkey BO double quote space generates double quote a 0x84 double quote a generates an umlaut

    Compose section

    The f
93. at is, the first argument is applied to the first conversion specification, the second argument to the second format specification, and so on. However, the conversions can be applied to the nth argument in the argument list, rather than to the next unused one, if the conversion character % is replaced by the sequence %digit$, where digit is a decimal integer n in the range between 1 and {NL_ARGMAX} (defined in the include file limits.h), giving the position of the argument in the argument list. For example:

        printf("%1$s %2$s\n", adjective, noun);

    In format strings containing the %digit$ form of a conversion specification, a field width or precision may be indicated by the sequence *digit$, where digit is a decimal integer in the range between 1 and {NL_ARGMAX}, giving the position of the argument containing the field width or precision. For example:

        printf("%1$d:%2$.*3$d:%4$.*3$d\n", hour, min, precision, sec);

    The format string can contain either numbered argument specifications (%digit$ and *digit$) or unnumbered argument specifications, but not both. When numbered argument specifications are used, specifying the nth argument requires that all the leading arguments, from the first to the (n-1)th, be specified in the format string.

    X/Open Conformance Statement Questionnaire, X/Open Portability Guide 3. Completed by: INTERACTIVE Systems Corporation, September 1991. Document Revision Number 3.2 X
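    Numbered argument specifications matter most when the format string is translated, because the translation may need the arguments in a different order. The sketch below assumes a C library that supports the %digit$ form (as X/Open-conforming libraries do); the function name format_phrase is hypothetical, and in practice the format would usually be fetched from a message catalogue.

```c
#include <assert.h>
#include <stdio.h>

/* Format an adjective and a noun according to a caller-supplied format
 * string. The format decides the word order through %1$s and %2$s.
 * Returns the value of snprintf: the length of the formatted text. */
int format_phrase(char *out, size_t out_size, const char *format,
                  const char *adjective, const char *noun)
{
    assert(out != NULL && format != NULL);
    assert(adjective != NULL && noun != NULL);
    return snprintf(out, out_size, format, adjective, noun);
}
```

    Calling format_phrase with "%1$s %2$s" keeps the adjective first, as in English, while a translated format of "%2$s %1$s" places the noun first without any change to the program's argument list.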
94. ation with SHIFT, F25 to F36 when used with CTRL, and F37 to F48 when used with CTRL and SHIFT together. F49 to F60 are the keys on the numeric keypad.

    On the console it is more flexible to change the scancode translation than to use the general mapping features described earlier. It also reduces the risk of reaching the 1K limit of the mapping buffer. ttymap(1) describes how the desired mapping should be laid out in a mapfile.

    2.3.1 A Sample mapfile

    Consider the following input to the ttymap program:

        sample file
        input
        toggle 0x14 CTRL SHIFT F2
        dead circumflex eh Ons <circumflex> e 0x88 <e circumflex>
        compose key compose 0x18 CTRL SHIFT F1 ve 0x89 <e diaeresis>
        output Ope
        Scancodes
        map CTRL SHIFT F1 to be 0x18 for the compose character key
        F37 0x18
        map CTRL SHIFT F2 to be 0x14 for the toggle key
        F38 0x14

    This file defines the compose and toggle keys, two deadkey sequences, one compose sequence, and KILL as the string to be displayed whenever 0 is sent to the output. Assuming this file is named mapfile, this mapping could be activated by typing:

        ttymap mapfile

    The terminal currently in use will then behave according to the mapping described. This has its drawbacks, however, for users with a French keyboard. For example, if a user with the login name paul can only use the keyboard corr
95. ationalised system should have are discussed here. The X/Open Portability Guide dedicates 7 chapters to internationalisation (see Volume 3, XSI Supplementary Definitions, chapters 2-8) describing these features. The INTERACTIVE UNIX Operating System supports all the features described there. The abilities described allow developers to create internationalised applications and users to take advantage of the fact that these applications are indeed internationalised. An internationalised application is a program that makes no hard-coded assumptions about the language, the local customs, or the coded character set. When the proper environment is set up for the user of that application, a program that displays the date displays it according to the local custom, a program that sorts takes into account the natural order of letters, and so on. The international environment is used to define user preferences, and internationalised utilities and features adapt their behaviour to those preferences, even when they change. A default environment is often established, but the user is always free to change the environment as required. The remainder of this section describes the international environment, how it is set up, and how it interacts with internationalised utilities and applications. 7.1 The International Environment Running applications in an internationalised environment is based on the concept of a local environment or locale wh
96. character collating ele ments such as the Spanish ch these must be specified via a collating element keyword The syntax is collating element symbol from string The symbol operand must be a string of one or more characters enclosed between angle brackets gt which cannot duplicate any symbolic name in the current charmap file or any other sym bolic name defined in this collation definition The string operand is a string of two or more characters to be collated as an entity For example collating element ch from lt c gt lt h gt collating element lt ss gt from ss 5 5 2 collating symbol Keyword In addition to characters and multicharacter collating elements you can also define special symbols for use in collation sequence state ments that is between the order start and the order end keywords Such a symbol does not have any character associated with it as the charmap symbols do However placing such a symbol in the collating sequence assigns to it a relative order that can be used in other collation collating element specifications The syntax is collating symbol symbol The symbol is a string of one or more characters surrounded by angle brackets which must not duplicate any symbolic name in the current charmap file or any other symbolic name defined in this collation definition For example collating symbol UPPER CASE collating symbol LOWER CASE collating symbol NO ACCENT collating symbol
97. cpio Y csplit Y date ed Y egrep Y Y Y expr Y 40 International Supplement User s Manual Categories Utility Int LC_ LC_ LC_ NUME MESS LC_ LC_ fgrep find grep join lIn lpstat 15 mail mv Pg pr ps red rm rsh sed sh sort tar tr uniq uucp uustat uux wc who yacc MO KK K K KKKK KMK KMKK lt KKK KKKK K For awk the period is used as the decimal delimiter in scripts to provide portability but in data to be processed as well as out put the decimal delimiter of the current Locale is honored ar and yacc are supplied with the INTERACTIVE Software Development System rather than the International Supplement In addition to the functionality specified by XPG3 other uucp related commands have been changed so that they are affected by the category LC_TIME in the 1 1 One of these commands International Supplement User s Manual 41 uux is included in XPG3 the remainder are not They may be found in the INTERACTIVE UNIX System User s System Administrator s Reference Manual The following is a summary of the additional functionality uucico uusched uux uuxqt LC TIME determines the format of date and time strings output by these commands uucleanup LC TIME affects the format of date strings included in messages composed by uucleanup J International Supplement User s Manual 43 GLOSSARY announcement mechanism ANSI ASCII AZ
98. cters Using Compose Sequences Although assigning deadkeys supports more characters than the ones printed on the keyboard, it has its disadvantages. As illustrated above, it is annoying when one needs the specific character alone that has been assigned as a deadkey: instead of one keystroke, two keystrokes are needed to access that character. If too many keys act as deadkeys, the system is difficult for everyone to use. Fortunately, another method exists, often referred to as compose sequences. A special key or sequence of keys is used to put the keyboard into a special mode. We will call the key or key sequence the COMPOSE key and the special mode the COMPOSE mode. The key sequence for the INTERACTIVE UNIX Operating System is CTRL SHIFT F1. Many MS-DOS users will be familiar with it. When in COMPOSE mode, the system expects the user to type two more characters before a character is generated. Press CTRL SHIFT F1 followed by n to produce the Spanish ñ (the ñ in mañana) on the screen. If you press the key sequence followed by pressing twice, an inverted exclamation sign appears on the screen. Both the value of the key and the list of key sequences and the characters they generate can be specified in a file that is then processed by the ttymap command. Refer to the International Supplement Manual for Advanced Users or ttymap(1) for more details. Some terminals, for example the DEC VT220, have a dedicated COMPOSE ke
99. ction key through the termcap or terminfo interface These interfaces allow the development of terminal independent applications The layout of both the numeric keypad section and the function key section of the keyboard is the same regardless of the country in which a specific keyboard is used 4 1 U S Personal Computer Keyboard Layout The central section of a keyboard designed for use in the United States contains keys for all letters of the English alphabet all digits and the most commonly used punctuation characters and special symbols Some of these symbols the slash for example are especially important when using the INTERACTIVE UNIX Operat ing System In addition a few special modifier keys are present The SHIFT key when pressed simultaneously with a letter key generates an uppercase character instead of a lowercase character or alternate symbols instead of the numbers and symbols on the top TOW The key exchanges uppercase and lowercase In other words when this key is pressed it changes the state of the keyboard so that all characters subsequently typed are automatically upper case and only appear in lowercase when pressed together with the key A CAPS LOCK light indicates the status of the key board The spacebar generates a space character to put one or more spaces between words Other special keys are TAB ALT ENTER and To learn more about the meaning of these keys refer to the INTERACTIVE UNIX O
100. d to as IBM extended ASCII has been used for several years; MS-DOS users are quite familiar with that. In heterogeneous UNIX System environments, a different codeset called ISO 8859 has been promoted. In both codesets, characters found in the ASCII codeset are represented in the same way. The other 128 characters are encoded differently, however, and some characters found in one codeset will be missing in the other. The INTERACTIVE UNIX Operating System supports both codesets; actually, it supports any 8-bit, one-byte codeset. To be able to use characters from the French, German, Finnish, and other alphabets, several terminals are available on the market that generate 7-bit codes but display the above-mentioned characters on the screen instead of the ones found on a US terminal. On the keyboard there are an equal number of keys, but there are different characters on the key caps. Others, such as a DEC VT220, will support 256 different characters at a time but use their own proprietary codesets. Assume you are using the INTERACTIVE UNIX Operating System with a console and a French 7-bit terminal connected to the serial port. If you edit a file on the terminal and use the French character é in text, the terminal will actually generate the ASCII code 123, which is the code normally used for the left curly brace. If you look at the edited file on the console the le
101. diately follows the last collation order entry END LC COLLATE This keyword must be the last in the file Each collation order entry consists of a character a collating symbol or a multi character collating element followed by weight information The detail format of the collation definition source is described in the International Supplement User s Manual The setting of the LC environment variables does not affect the behaviour of the colldef command ERRORS FILES If an error is detected no collation tables are created If warnings occur specifying the c option will cause permanent out put to be created The following conditions will cause warning mes sages to be issued l If a symbolic name not found in the charmap file is used to define a collating element the element is discarded and a warn ing message issued 2 If the number of arguments to the order keyword exceeds the COLL WEIGHTS MAX limit which is defined in the file usr include sys limits h a warning message will be issued lib locale ISC LC_COLLATE lib charmap SEE ALSO strcoll 3P strxform 3P charmap 5P locale 5P International Supplement User s Manual INTERACTIVE UNIX System 2 International Supplement gencat 1P gencat 1P NAME gencat generate a formatted message catalogue SYNOPSIS gencat c catfile msgfile DESCRIPTION The gencat utility merges the message text source file s msgf
102. ds on the number in the file name We recommend using a number greater than all the others for the script that changes the font The directory also contains files with names that begin with the letter K these are executed when the system is switched back to single user mode For example this directory might contain K36sendmail SO6TMPRAMD S21perf S01MOUNTFSYS S11uname S70uucp SOSRMTMPFILES S20sysetup s95font International Supplement Manual for Advanced Users 7 2 4 3 Other Terminals When the system is booted a getty program is started on every terminal that is configured in the system This program prints login or any other herald on the screen and waits until some one types input It then calls the 1ogin program for password verification which in turn executes the user s login program which is typically the UNIX System command interpreter the shell Each such terminal is represented by one line in the system file etc inittab By modifying such a line mapping can be activated prior to logging in on any terminal For example a line for the console would be co 12345 respawn etc getty m usr lib keyboard 437 en US console console To activate mapping on another terminal simply add the m option followed by the name of the appropriate mapping file to the getty command on the line representing the terminal Most terminal de vices have a name that contains the string tty For example 00 2345 0ff etc getty dev tty00 96
103. e locales other than the default US English (the C locale). It also provides the enhanced UNIX System utilities that understand the X/Open announcement mechanism discussed below. Refer to the International Supplement Manual for Advanced Users, the International Supplement Reference Manual, and Volumes 1, 2, and 3 of the X/Open Portability Guide for more details. 9. THE SYSTEM V ENVIRONMENT Beginning with UNIX System V Release 3.1, serious attempts were made to make the UNIX Operating System function better in an international environment. Most UNIX System utilities that stripped the eighth bit of a byte were made 8-bit clean. In addition, some of the functionality described in the previous section was made available, in particular date and time formats and character classification. In order to access the local language information, a utility or application needs to know its location. The mechanism used to communicate its location is called an announcement mechanism. Unfortunately, the System V and X/Open announcement mechanisms are different. The System V mechanism is described in this section because certain UNIX System utilities, such as vi, support it. 9.1 Date and Time Formats Most UNIX System utilities that display the time or the date (date and ls, for example) and all applications developed on UNIX System V that use the ctime function (see ctime(3C)) can be give
104. e ordering statements 30 International Supplement Manual for Advanced Users 5 5 6 An Example Notes 1 elements 2 The 3 LC COLLATE collating element ch from lt c gt lt h gt collating element lt ss gt from ss collating symbol UPPER CASE collating symbol XLOWER CASE collating symbol NO ACCENT collating symbol GRAVE collating symbol ACUTE substitute Mc with Mac order start forward backward forward lt UPPER_CASE gt lt LOWER_CASE gt lt NO_ACCENT gt lt GRAVE gt lt ACUTE gt lt space gt lt A gt lt a gt lt a acute gt lt a grave gt lt B gt lt b gt lt c gt lt C cedilla gt lt c gt lt ce cedilla gt lt ch gt lt s gt lt s gt lt ss gt lt sharp s gt UNDEFINED order end END LC COLLATE IGNORE IGNORE IGNORE lt A gt lt UPPER_CASE gt lt NO_ACCENT gt lt A gt lt LOWER_CASE gt lt NO_ACCENT gt lt A gt lt LOWER_CASE gt lt ACUTE gt lt A gt lt LOWER_CASE gt lt GRAVE gt lt C gt lt C gt lt C gt lt gt lt gt lt 111 gt lt gt lt gt lt gt lt gt lt gt lt 111 gt lt ch gt lt ch gt lt ch gt lt S gt lt S gt lt S gt lt S gt lt s gt lt s gt lt S gt lt S gt lt s gt lt s gt lt s gt lt s gt lt S gt lt S gt lt s gt lt s gt lt s gt lt s gt IGNORE IGNORE IGNORE See Note 1 See Note 2 See Note 3
105. econd one would be stored normally, and vice versa (for example, if you provide 0x41 0x61, then 0x61 is stored where 0x41 would be stored normally, and 0x41 is stored where 0x61 would be stored normally). The second output file, a data file, contains the same information but is structured for efficient use by the character classification and conversion routines (see ctype(3C)). The name of this output file is the value of the character classification chrclass read in from file. This output file must be installed in the /lib/chrclass directory under this name by someone who is superuser or a member of group bin. This file must be readable by user, group, and other; no other permissions should be set. To use the character classification and conversion tables on this file, set the environmental variable CHRCLASS (see environ(5)) to the name of this file and export the variable. For example, if the name of this file (and character class) is xyz, you should issue the commands:
    CHRCLASS=xyz; export CHRCLASS
If no input file is given, or if the argument - is encountered, chrtbl reads from the standard input file. The syntax of file allows the user to define the name of the data file created by chrtbl, the assignment of characters to character classifications, and the relationship between uppercase and lowercase letters. The character classifications recognised by chrtbl are
106. ectly after typing this command, he is then forced to type pqul to log in to the system, has to have chosen a password that can still be typed in, and has to type tty,qp to access the ttymap command itself. To avoid this awkward situation, INTERACTIVE has enhanced the getty command to activate the mapping prior to login. A new option, -m, has been added. Refer to section 2.4 and getty(1M) for details. 2.4 Activating Mapping Prior to Login 2.4.1 The System Console When the INTERACTIVE UNIX System is installed, the system asks for keyboard information. This automatically configures the system for the proper mapping on the console for the keyboard selected, provided IBM codepage 437 is used. 2.4.2 Changing the Default Font for the Console When the system is booted, IBM codepage 437 is automatically used on the console. The system can be configured to automatically use a different font without the need for any additional commands from the user. To do this, create a shell script with a name that starts with S and a number (for example, S95font) containing the appropriate loadfont command, replacing the one in this example:
    # set the appropriate loadfont
    /usr/bin/loadfont 8859
Place this file in the directory /etc/rc2.d, which contains a number of shell scripts that are automatically executed when the system comes up in multi-user mode. The order of execution depen
107. ent User s Manual 6 DISPLAYING DATA When characters are displayed on the screen of your terminal or console these characters physically consist of a set of white dots that make up the picture of the character Typically a rectangle of 8 by 16 dots is reserved for every character The one to one rela tionship between a character actually the numeric representation of a character and its picture is called a font Depending on how the INTERACTIVE UNIX System is used fonts may or may not be modified After typing a character and possibly storing that character in file a code usually the same as the input code is sent to the terminal to indicate that it should display something If necessary the code sent by the system or the application can be modified before it is sent to the screen This practice is called output mapping Again ttymap is the utility responsible for this function Proper output mapping and possible modification of the font guarantees the display of the proper character or when the actual character cannot be displayed at least something that makes sense Here are a number of suggestions for making the INTERACTIVE UNIX System work correctly 6 1 7 bit Terminals When 7 bit character terminals are used a 128 character font that is hardcoded inside the terminal hardware is used This font cannot be modified but more sophisticated terminals allow access to several different fonts one for each language supported Th
108. ent as defined by the first condition met below: 1. If LC_ALL is defined in the environment and is not null, the value of LC_ALL is used. 2. If there is a variable defined in the environment with the same name as the category and that is not null, the value specified by that environment variable is used. 3. If LANG is defined in the environment and is not null, the value of LANG is used. If the resulting value is a supported locale, setlocale sets the specified category of the program's locale to that value and returns the value specified below. If the value does not name a supported locale (and is not null), setlocale returns a NULL pointer and the program's locale is not changed by this function call. If no non-null environment variable is present to supply a value, setlocale sets the specified category of the program's locale to the default locale (see above). Setting all of the categories of the program's locale is similar to successively setting each individual category of the program's locale, except that all error checking is done before any actions are performed. To set all categories of the program's locale, setlocale is invoked as:
    setlocale(LC_ALL, "");
In this case, setlocale first verifies that the values of all environment variables it needs, according to the precedence above, indicate supported locales. If the value of any
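The three-step precedence listed in the setlocale excerpt above (LC_ALL, then the variable named after the category, then LANG) can be sketched as a small lookup routine. This is only an illustration of the rule as the text states it, not the library's implementation: effective_locale_name and nonnull_env are hypothetical names, and the caller is assumed to pass the environment variable name that matches the category, such as "LC_TIME".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of an environment variable, or NULL if it is
 * unset or set to the empty string ("null" in the manual's wording). */
static const char *nonnull_env(const char *name)
{
    const char *v = getenv(name);
    return (v != NULL && v[0] != '\0') ? v : NULL;
}

/*
 * effective_locale_name is a hypothetical helper. It applies the
 * precedence from the text: LC_ALL first, then the variable with the
 * same name as the category, then LANG. A NULL result means no
 * variable supplied a value, so the default locale would apply.
 */
const char *effective_locale_name(const char *category_var)
{
    const char *v;

    assert(category_var != NULL);
    if ((v = nonnull_env("LC_ALL")) != NULL)
        return v;
    if ((v = nonnull_env(category_var)) != NULL)
        return v;
    return nonnull_env("LANG");
}
```

In a real program there is no need to do this by hand; calling setlocale with an empty string as the locale argument performs the lookup, and the sketch only makes the order of the checks visible.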
109. er is printable LC TIME affects date format Yes mail LC TIME affects date format Yes mailx LC COLLATE LC CTYPE affect file name No pattern matching LC TIME affects date format No pg LC COLLATE LC CTYPE affect filename Yes pattern matching pr LC TIME affects date format Yes LC CTYPE is used to determine whether a Yes character is printable ps LC TIME affects date format Yes rm rmdir LANG affects yes string Yes sed LC COLLATE LC CTYPE affect regular Yes expression matching LC CTYPE is used to determine whether a Yes character is printable sh LC COLLATE LC CTYPE affect filename Yes pattern matching LC CTYPE is used to determine whether a Yes character is alphabetic Page 3 3 3 X Open Conformance Statement XCS QUE 3 2 Questionnaire Command Behaviour Specified in XPG3 Supported sort LC COLLATE affects sorting sequence Yes LC_CTYPE affects character classification Yes alphabetic uppercase printing LC_NUMERIC affects the determination of Yes the radix character tar LC_TIME affects date format Yes LANG affects yes string Yes tr LC_COLLATE LC CTYPE affect bracketed Yes expressions LC_CTYPE affects the definition of the Yes character universe uniq LC COLLATE affects sorting sequence Yes uucp LC TIME affects date format Yes uustat LC TIME affects date format Yes wc LC_CTYPE is used to determine white space Yes characters who LC_TIME affects date format Yes yacc LC_CTYPE is used to determine character Yes class
110. ers and Cod sets IBM Codepages ISO Codesets 7 bit Codesets WWW N e 10 5 6 Choosing and Configuring a Codeset 5 6 1 Converting From One Codeset to Another so res d DISPLAYING DATA 6 1 7 bit Terminals 6 2 The Console 6 3 Displaying Data and Using INTERACTIVE X11 lt gt THE INTERNATIONAL ENVIRONMENT 7 1 The International Environment 7 2 Controlling the International Environment INTERNATIONALISED BEHAVIOUR 8 1 Date and Time Format 8 2 Character Classification 8 3 Collation 8 3 1 Example 8 4 Numeric and Monetary Formatting 8 5 Yes No Responses 8 6 Message Catalogues 8 7 The X Open Environment THE SYSTEM V ENVIRONMENT 9 Date and Time Formats 9 2 Character Classification INTERNATIONALISED INTERACTIVE UNIX SYSTEM UTILITIES TEC M Ge GLOSSARY l International Supplement User s Manual 1 INTRODUCTION This document explains the internationalisation features of the INTERACTIVE UNIX Operating System and describes how to use it on computer systems outside the United States U S where there are differences in local language customs and standards This document focuses on usability and is restricted to those areas where languages are spoken that use an alphabet that contains fewer than one hundred letters Korean Japanese Chinese and other languages with thousands of different letters are not supported by the standard INTERACTIVE UNIX Operating System In ce
111. ese terminals support the ISO 646 ASCII variants described in the previous sec tion To ensure consistency throughout the system assuming a French 7 bit terminal is used e On input map the 7 bit code generated for the French charac ters into their actual 8 bit value e On output map the 8 bit code back to the 7 bit code to display the correct French character e Use trigraphs for ANSI C programming e To generate curly braces and other such characters use the decimal representation On output map to a space character This ensures the proper display of the file used especially when the same file is later edited on devices such as the console International Supplement User s Manual 27 If the inability to display curly braces and other typical UNIX Sys tem characters such as V is too annoying use this alternative approach e Use the SETUP key of the terminal to switch it to US English You now have access to a U S ASCII font but still have a French keyboard layout e When a French character key is pressed it is mapped and stored using its correct 8 bit value e On output it is mapped to the corresponding character without the accent or the closest looking English letter for example a c instead of a e Use decimal representation for the UNIX System characters which are automatically stored as 7 bit characters and displayed correctly Your system administrator should develop the correct ttymap descript
112. eturns a string with the value requested. Refer to nl_langinfo(3P) for more information. 8.4 Character Classification At a minimum, use the following statement in your program:
    setlocale(LC_CTYPE, "");
Make sure you also use the family of toupper, isupper, and similar functions. No further changes have to be made to the program. Refer to ctype(3C) for more information. 8.5 Collation There are two functions for handling international sorting: strcoll and strxfrm. They are also part of the ANSI C standard. They differ from the traditional strcmp in that they use the sorting rules defined in a given locale rather than using the internal byte representation inside the computer. At a minimum, the following statement should be part of the program:
    setlocale(LC_COLLATE, "");
strcoll is very similar to strcmp but is slower than the older function, since it is table driven. strxfrm is a different type of function in that it transforms the data it gets and returns a string of characters that can be given to strcmp to be sorted. It is useful when performance is an issue and the same set of data needs to be compared several times. Refer to strcoll(3P), strxfrm(3P), and string(3P) for more information. 8.6 Regular Expressions Programs have access to internationalised regular expressions when they are compiled with the -Xp option and include the following statements in the program:
    #define _XOPEN_SOURCE
    #include <regexp.h>
8.7
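As a sketch of the collation advice in the excerpt above, the following shows strcoll used as the comparison function for qsort after LC_COLLATE has been set. The helper names sort_words and compare_collated are hypothetical, and passing the locale name as a parameter is an assumption made here so the behaviour can be demonstrated with the "C" locale; a program following the manual would pass an empty string so that the user's environment decides.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback: compare two strings by the rules of the current
 * LC_COLLATE category instead of by raw byte values. */
static int compare_collated(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcoll(*sa, *sb);
}

/*
 * sort_words is a hypothetical helper. locale_name is handed to
 * setlocale unchanged; "" means "take the locale from the environment".
 * Returns 0 on success, -1 if the requested locale is not supported.
 */
int sort_words(const char **words, size_t count, const char *locale_name)
{
    assert(words != NULL && locale_name != NULL);
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;
    qsort(words, count, sizeof words[0], compare_collated);
    return 0;
}
```

In the "C" locale the result matches a strcmp sort, so uppercase letters come before lowercase ones; with a locale that defines its own collating sequence, the same call can produce a different order without any change to the program.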
113. file and its exit status will be 1 Under certain error conditions gencat will continue process ing all msgfiles before exiting with an error status These conditions include l If catfile exists either it cannot be opened there is an error reading it or it has corrupted data ff 2 For any msgfile either it cannot be opened or it has a syntax error For any other errors exit will be immediate INTERACTIVE UNIX System 1 International Supplement gencat 1P gencat 1P WARNINGS The following conditions will not generate an error but will cause a warning message to be printed 1 There is an attempt to delete a message or set that doesn t exist 2 The specified catfile is an empty file 3 A temporary file cannot be unlinked NOTES Using non contiguous set or message numbers using a set number other than 1 as the first set or using a message number other than 1 as the first message of a set will cause the size of catfile to be larger than using only contiguous numbers starting with 1 Message catalogues produced by gencat are binary encoded which means that their portability cannot be guaranteed between different types of machines Thus just as C programs need to be recompiled for each type of machine so message catalogues must be recreated via gencat SEE ALSO showcat 1P gencat 4P NOTE TO USERS This entry is reprinted from the INTERACTIVE UNIX System User s System Administrator s Reference Manual I
114. for Greek and for the Slavic languages as well Try running the program from the previ ous section again but showing codepage 850 instead Type loadfont 850 The screen will flash and the shell prompt will reappear Now the console is using a different codeset Notice the differences between the output of the command and the previous output To switch back type loadfont 437 5 4 ISO Codesets The organization that sets international standards called ZSO has also defined 8 bit codesets to be used on computer systems in different territories This standard is more widely adopted on larger computer systems running the UNIX Operating System This fam ily of codesets is referred to as the ISO 8859 standard The codeset used in Western Europe is the 8859 1 codeset which is the standard adopted by the X Open Company for information interchange Type loadfont 8859 and run the show program again The following can be observed e There is no symbol for the first 32 values of the second 128 numbers e There no graphics characters to draw boxes e The difference between the values of an uppercase character and a lowercase character is always constant 32 24 International Supplement User s Manual e The values chosen for the accented characters are different from IBM codepage 437 for example is represented by 234 in ISO 8859 1 and by 134 in IBM codepage 437 To switch back type loadfont 437 There are 9 different 8
115. function transforms the string pointed to by s2 and places the resulting string into the array pointed to by s1. The transformation is such that if the strcmp (see string(3P)) or memcmp (see memory(3C)) functions are applied to the two transformed strings, it returns a value greater than, equal to, or less than zero, corresponding to the result of the strcoll(3P) function applied to the same two original strings, based on the collating sequence information in the program's locale category LC_COLLATE (see locale(5P)). No more than n characters are placed into the resulting array pointed to by s1, including the terminating null character. If n is zero, s1 is permitted to be a null pointer. If copying takes place between objects that overlap, the behaviour is undefined. RETURN VALUE The strxfrm function returns the length of the transformed string (not including the terminating null character). If the value returned is n or more, the contents of the array pointed to by s1 are indeterminate. The strxfrm function returns (size_t)-1 on error and sets errno to indicate the error. ERRORS The strxfrm function may fail if: EINVAL The s1 or s2 argument contains characters outside the domain of the collating sequence. SEE ALSO strcoll(3P), locale(5P); memory(3C), string(3P) in the INTERACTIVE SDS Guide and Programmer's Reference Manual. NOTE TO USERS This entry is reprinted from the INTERACTIVE SDS Guide and Programmer's Reference Manual
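The rule in the strxfrm excerpt above that a null destination is allowed when n is zero leads to a common two-call pattern: ask for the length first, then allocate and transform. The sketch below assumes that pattern; make_sort_key is a hypothetical helper name, and the (size_t)-1 check follows the error return described in this manual page rather than anything guaranteed by every implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * make_sort_key is a hypothetical helper. It returns a newly allocated
 * transformed copy of s that can be compared with strcmp, or NULL on
 * failure. The caller releases the result with free().
 */
char *make_sort_key(const char *s)
{
    size_t len;
    char *key;

    assert(s != NULL);

    /* First call: n is 0, so the destination may be a null pointer.
     * The return value is the key length without the terminator. */
    len = strxfrm(NULL, s, 0);
    if (len == (size_t)-1)
        return NULL;

    key = malloc(len + 1);
    if (key == NULL)
        return NULL;

    /* Second call: the buffer is large enough, so the key is complete. */
    strxfrm(key, s, len + 1);
    return key;
}
```

Building each key once and then comparing keys with strcmp is the case the manual has in mind when it says strxfrm is useful if the same data is compared several times.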
116. ge 2 1 4 X Open Conformance Statement Questionnaire Rationale XCS QUE 3 2 Each of these limits can vary within bounds set by the X Open Portability Guide The minimum value that a limit can take on any X Open conforming system is given in the corresponding POSIX value A specific conforming implementation may provide a higher minimum value than this and the maximum value that it provides can differ from the minimum Some conforming implementations may provide a potentially infinite value as the maximum in which case the value is considered to be indeterminate The minimum value must always be definitive since the _POSIX_ value provides a known lower bound for the range of possible values Reference XPG3 Volume 2 Page 538 lt limits h gt Question 4 What are the values associated with the following constants specified in the limits h header file Answer Macro Name CHAR LONG WORD DBL DIG DBL_MAX Meaning Number of bits in a char Number of bits in a long Number of bits in a word Digits of precision of a double Maximum decimal value of a double Value 32 32 15 1 797693 1348623 157e 308 Page 2 1 5 Conformance Statement XCS QUE 3 2 Questionnaire FLT DIG Digits of precision 6 of a float FLT MAX Maximum decimal 3 4028234663852885 38 value of a float Rationale This set of constants provides usefu
117. gement • Networking Services 3.4 Standard Portable Operating System Interface (POSIX.1) Volume 2 of the X/Open Portability Guide, XSI System Interfaces and Headers, is a superset of the POSIX.1 Standard published by the Institute of Electrical and Electronics Engineers, Inc. (IEEE). POSIX.1 stands for the Standard Portable Operating System Interface for Computer Environments. This standard defines a standard operating system interface and environment, based on the UNIX Operating System documentation, to support application portability at the source level. This is the first of a group of proposed standards known colloquially and collectively as POSIX. It is a superset of the system interfaces of the UNIX Operating System. XSI also adds a number of interfaces, particularly in the area of internationalisation, which go beyond both the SVID and POSIX.1. 3.5 POSIX.2 Volume 1 of XPG3, XSI Commands and Utilities, is based on the SVID, which means that the utilities have the same names and features as the standard utilities supplied with the UNIX System, with some additional utilities. However, when used in an international environment, many of these utilities exhibit additional behaviour based on the draft POSIX.2 Standard. The latter describes how the command interpreter and the utilities of the operating system should work and interface with the user; it is expected to become an official standard
118. haracters or symbols indicate one to many mapping The special symbol IGNORE means that this character is to be ignored at the defined weight level for collation purposes For example if the dash is IGNOREd then the two strings co ordinate and coordinate collate as equals In regular expressions such characters are never ignored Ranges are based on the order in which elements are listed in the definition basic character ordering sequence and all charac ters are explicitly or implicitly listed All characters specified via an ellipsis are assigned unique weights and are ordered according to their coded character set values Characters specified via an explicit or implicit UNDEFINED special symbol are by default assigned the same primary weight that is they belong to the same equivalence class An ellipsis symbol as a weight is interpreted to mean that each character in the sequence must have unique weights equal to the relative order of the charac ter in the character collation sequence Secondary and subsequent weights have unique values The use of the ellipsis as a weight is treated as an error if the collating element is neither an ellipsis nor the special symbol UNDEFINED An empty weight implies that the collating element will be assigned a weight equal to the current position in the order In other words the collating element collates as itself 5 5 5 order end Keyword The order end keyword terminates th
119. harmap file refer to charmap 5P The processor assumes that the definition is a generic one intended for use with many codesets Such a generic definition may contain characters not present in all codesets Therefore the colldef processor assumes that the character should simply be ignored and issues a warning message to that effect Note that any escape character or right angle bracket in a symbolic name must be preceded by the escape character Using symbolic names rather than any other notation makes it possible to use the same source definition with several codesets For example lt c gt lt a gt gt lt c cedilla gt 4 gt lt gt lt gt International Supplement Manual for Advanced Users 23 2 Character notation A character is specified by the character itself The quote comma semicolon angle brackets and escape character lt gt and escape character must be escaped preceded by the escape character if they are found outside strings enclosed by double quotes only the double quote must be escaped inside quoted strings For example e3g3a May Octal notation An octal constant must be specified as the escape character followed by two or three octal digits For example 143 347 115 141 171 Hexadecimal notation A hexadecimal constant must be specified as the escape char acter followed by an x followed by one or two hexadecimal digits For example 63 7 x4a x61 x79
120. he first string must be the full name of the first month of the year (January), the second the full name of the second month, and so on. For example: mon Januar Februar März April Mai Juni Juli August September Oktober November Dezember 3.3.5 d_t_fmt Keyword This keyword is used to define the appropriate date and time representation, corresponding to the date %c field descriptor. The operand must consist of a string and may contain any combination of characters and date field descriptors. In addition, the string may contain the date %n and %t field descriptors for newline and tab characters, respectively. For example: d_t_fmt %b %d %H %M %S 3.3.6 d_fmt Keyword This keyword is used to define the appropriate date representation, corresponding to the date %x field descriptor. The operand must consist of a string and may contain any combination of characters and date field descriptors. For example: d_fmt %m %d %y 3.3.7 t_fmt Keyword This keyword is used to define the appropriate time representation, corresponding to the date %X field descriptor. The operand must consist of a string and may contain any combination of characters and date field descriptors. For example: t_fmt %H %M %S 3.3.8 am_pm Keyword This keyword is used to define the appropriate representation of the ante meridiem and post meridiem strings, corresponding to the date %p field descriptor. The operand must co
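The field descriptors named in the date and time keywords above are the same conversion specifications that the standard C function strftime() accepts. The following is a minimal sketch, not taken from the manual, of how a program might apply such a format string; the helper names format_timestamp and make_tm are hypothetical.

```c
#include <stddef.h>
#include <time.h>

/*
 * Format a broken-down time with a caller-supplied template.
 * Returns the number of bytes written (excluding the NUL), or 0 if the
 * result did not fit in the buffer.
 */
size_t format_timestamp(char *buf, size_t size, const char *fmt,
                        const struct tm *when)
{
    return strftime(buf, size, fmt, when);
}

/* Build a struct tm for a fixed wall-clock time (hypothetical helper). */
struct tm make_tm(int year, int month, int day, int hour, int min, int sec)
{
    struct tm t = {0};

    t.tm_year = year - 1900;  /* years are counted from 1900 */
    t.tm_mon  = month - 1;    /* months are numbered from 0 */
    t.tm_mday = day;
    t.tm_hour = hour;
    t.tm_min  = min;
    t.tm_sec  = sec;
    return t;
}
```

A program that has called setlocale(LC_TIME, "") can pass "%c", "%x", "%X" or "%p" as the template to pick up the locale's own date and time, date, time, and AM/PM representations. Note that make_tm does not fill in tm_wday or tm_yday, so a caller that uses weekday descriptors should normalise the structure with mktime() first.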
121. hese locales are located in 1ib locale ISC localename src The default message catalogue location is 1ib locale ISC msgcat The libc cat message catalogue contains the English language version of the error mes sages displayed by the library routines perror 3P and strerror 3P lib locale ISC msgcat src libc msg is the source file it can be translated into other languages which can then be used to generate alternate message catalogues for use by those routines 4 International Supplement Overview and Installation A subset of contributed data files containing additional locales keyboard mapping files and so on is also supplied Some of these files have been contributed by third parties All of these files are supplied as is and are not supported 3 DOCUMENTATION REFERENCES Throughout this guide the following full documentation titles will be referenced in shortened versions as follows Full Title Shortened Version INTERACTIVE UNIX System V 386 INTERACTIVE UNIX Release 3 2 Operating System Guide Operating System Guide INTERACTIVE UNIX System V 386 INTERACTIVE UNIX System Release 3 2 User s System Administrator s User s System Administrator s Reference Manual Reference Manual INTERACTIVE Software INTERACTIVE SDS Guide and Development System Guide and Programmer s Reference Manual Programmer s Reference Manual References of the form name n refer to an entry called name in section of the reference manual or man
122. ib keyboard map INTERACTIVE UNIX System 7 International Supplement ttymap 1 ttymap 1 FILES usr lib keyboard usa map sample mapfile for using compose character sequences and deadkeys on a U S keyboard usr lib keyboard map sample mapfiles for European boards without compose and deadkey ETN sections usr lib keyboard keys dump of default keytable for PC keyboard usr lib keyboard strings dump of default stringtable for PC keyboard SEE ALSO stty 1 keyboard 7 termio 7 in the INTERACTIVE UNIX System User s System Administrator s Reference Manual NOTE TO USERS This entry is reprinted from the INTERACTIVE UNIX System User s System Administrator s Reference Manual INTERACTIVE UNIX System 8 International Supplement catclose catclose 3P NAME catclose close a message catalogue descriptor SYNOPSIS include lt nl_types h gt int catclose catd nl catd catd DESCRIPTION The catclose function closes the message catalogue identified by The file descriptor underlying the message catalogue descriptor will be closed RETURN VALUE Upon successful completion a value of 0 is returned ERRORS No errors are defined SEE ALSO catopen 3P NOTE TO USERS This entry is reprinted from the INTERACTIVE SDS Guide and Programmer s Reference Manual INTERACTIVE UNIX System 1 International Supplement catgets 3P catgets 3P NAME catgets read a program message
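The catclose(3P) entry above belongs to the message catalogue interface declared in <nl_types.h>, together with catopen() and catgets(). The following is a minimal sketch, under the assumption of a standard X/Open implementation, of how the three calls fit together; the function name fetch_message and its parameters are hypothetical.

```c
#include <nl_types.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy message (set_id, msg_id) from the named catalogue into buf.
 * If the catalogue cannot be opened or the message is missing, the
 * fallback text is copied instead. Returns buf.
 */
char *fetch_message(const char *catalogue, int set_id, int msg_id,
                    const char *fallback, char *buf, size_t size)
{
    const char *text = fallback;
    nl_catd catd;

    if (size == 0)
        return buf;

    catd = catopen(catalogue, 0);
    if (catd != (nl_catd)-1)
        text = catgets(catd, set_id, msg_id, fallback);

    /* Copy before closing: the pointer may refer to catalogue storage. */
    strncpy(buf, text, size - 1);
    buf[size - 1] = '\0';

    if (catd != (nl_catd)-1)
        catclose(catd);
    return buf;
}
```

Copying the text before calling catclose() is a cautious design choice: the string returned by catgets() may live in storage that belongs to the open catalogue.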
123. ich is defined as the subset of the user's environment that depends on language and cultural conventions. A locale consists of a number of categories, with each category controlling a specific aspect of the international environment. Each category is usually referred to by the variable used to set or modify it. The International Supplement recognizes the following categories: • Date and Time Format: This category (LC_TIME) affects how date and time are displayed. • Character Classification: This category (LC_CTYPE) defines codeset characteristics and character classification. • Collation: This category (LC_COLLATE) affects the collation (sorting) order. • Numeric and Monetary Formatting: These categories (LC_NUMERIC and LC_MONETARY) affect the format of nonmonetary and monetary numeric information, such as the decimal delimiter. • Yes/No Responses: This category (LC_MESSAGES) affects the strings used to indicate yes/no answers to utility and application queries. Note that while the internationalised yes/no response is required by XPG3 for certain commands, the LC_MESSAGES category is not part of the locale as defined by XPG3. • Message Catalogues: Message catalogues are not yet covered by the locale categories but use similar mechanisms. The locale and the various categories only affect the behaviour of an application if the application is set up to do so. This ensu
124. ification Rationale This behaviour is collectively optional that is it should be provided for all commands listed subject to sections 3 1 and 3 2 which iden tify those commands not supplied by the vendor and those which do not fully support the X Open specification Reference XPG3 Volume 1 Pages 4 5 Status of Interfaces Page 3 3 4 Ar X Open Conformance Statement XCS QUE 3 2 Questionnaire 3 3 2 Regular Expressions in Commands Question 6 Which form of regular expression syntax is supported by those commands which use regular expressions Answer Command Regular Expression Syntax Supported awk Extended Internationalised csplit Simple Internationalised ed Simple Internationalised egrep Extended Internationalised ex Simple expr Simple Internationalised grep Simple Internationalised lex Extended pg Simple Internationalised sdb Simple sed Simple Internationalised vi Simple Note An XPG 3 conforming system which claims support for inter nationalised commands should provide the regular expression syntax marked in bold in the above table Where neither options are marked in bold either may be provided Rationale The XPG Volume 3 XSI Supplementary Definitions requires that an internationalised set of commands will provide regular expression syntax for the above commands in one of the forms specified for that command The XPG encourages the implementation of Page 3 3 5 X Open Conformance Statement XCS
125. ile into a formatted message catalogue, catfile. The file catfile will be created if it does not already exist. If catfile does exist, its messages will be included in the new catfile. If set and message numbers collide, the new message text defined in msgfile will replace the old message text currently contained in catfile. If the -c option is specified on the command line, or the existing catfile was generated with the -c option, the catfile will be confidential; that is, it will not be translatable into a message text source file by the showcat(1P) utility. In this implementation, gencat makes the following interpretations with respect to the format of a message text source file (see gencat(4P) for the format of a message text source file as defined in the X/Open Portability Guide, Volume 3, XSI Supplementary Definitions, Section 5.2.1, "Message Text Source Files"): 1. Set number ordering relates to set numbers from both $set and $delset directives. Thus, the following is illegal: $delset 2 followed by $set 1. 2. A set or message number can be equal to the preceding one. Thus, the following is legal: $delset 2 followed by $set 2. 3. If any line in a message text source file (not just a text string) ends with a backslash (\), that is treated as a line continuation. This utility operates in an 8-bit transparent manner. ERRORS If there are any errors in the course of processing any msgfile or (if it exists) catfile, gencat will not generate a new cat
126. international environment. For example, the default decimal delimiter in the U.S. is a period, but in most European countries the comma is used instead, which in turn is used in the U.S. as the thousands separator character. So 1,000, which is one thousand dollars in the U.S., could be interpreted as a single dollar in Europe. Misinterpreting things the other way around could be quite an expensive mistake. By defining numeric and monetary formatting with the correct values, programs display fractions using the appropriate decimal delimiter. Applications such as accounting programs often have to be modified to display the correct monetary symbol. The manner in which numbers representing amounts of money are formatted is also subject to local conventions. 8.5 Yes/No Responses Some utilities, such as rm, require the user to acknowledge whether a specific action should be taken. The usual response is either yes or no. Before internationalisation, such utilities required the user to respond using the English y or n. Such a response is not natural to French-speaking people, for whom oui would of course be more natural than yes. INTERACTIVE has added the capability to define the correct yes and no responses for a particular locale. 8.6 Message Catalogues The message catalogue system specified by XPG3 allows program messages to be stored separately from the logic of the
127. ion file for your machine 6 2 The Console On the console a font of 256 different symbols can be used That font information is stored in Random Access Memory RAM on the video card inside the computer to which the monitor is attached The information can be changed on old or inexpensive systems the information is stored in Read Only Memory ROM and can only be changed by replacing the ROM with a different ROM INTERACTIVE has developed a utility called loadfont to change the font information in the video card This utility has predefined built in fonts However anyone can use it to develop a personalized font Refer to oadfont 1 for more information 6 3 Displaying Data and Using INTERACTIVE X11 INTERACTIVE X11 and X11 based applications always use fonts when text is displayed Most applications have a command line option fn to indicate which font to use Fonts for both the 8859 1 most of the supplied fonts and IBM 437 codesets are sup plied with INTERACTIVE X11 The font files supplied with the International Supplement can also be used with INTERACTIVE X11 after converting them with the bdftosnf utility 28 International Supplement User s Manual 7 THE INTERNATIONAL ENVIRONMENT The internationalisation features discussed thus far have all involved compliance with international standards and the ability to correctly enter store and display the letters used by the local language Some of the other features an intern
128. iptions of the individual characters They consist of the following parts The word STARTCHAR followed by up to 14 characters no blanks describing the character This can either be some thing like C0041 which indicates the hex value of the charac ter or uppercaseA which describes the character The word ENCODING followed by a positive integer represent ing value by which this character is represented internally in the codeset for which this font is used The integer needs to be specified in decimal The word SWIDTH followed by the scalable width in x and y of character Scalable widths are in units of 1 1000th of the size of the character The y value should always be 0 the x value is typically 666 for the type of characters used with loadfont The values are not checked by the loadfont utility but this line needs to be there for compatibility purposes The word DWIDTH followed by two numbers which in a BDF file would mean the width in x and y of the character in device units The y value is always zero The x value is typically 8 loadfont checks only for the presence of the DWIDTH keyword The word BBX followed by the width in x height in y and x and y displacement of the lower left hand corner from the ori gin of the character Most fonts used by video cards will not use the bottom 4 rows of pixels which basically means a vertical y displacement of 4 The only width allowed by loadfont is 8 heights sup ported are 8
129. irst line of this section describes what the compose character is That line should always be present in the mapfile Subsequent lines consist of three character representations indicating each time that the third character needs to be generated on input when the compose char acter is followed by the first two Compose sequences with the same first character should be grouped together For example compose 0x89 e with umlaut is generated when typing x a 0x84 a with umlaut e 0x89 with umlaut is generated when typing x e 0x84 a with umlaut The following example would give the wrong result All lines starting with the same character specification should be grouped together compose C7 0x89 e with umlaut is generated when typing x 7 0x89 e with umlaut is generated when typing x e a 0x84 a with umlaut a 0x84 a with umlaut Output section This section describes the mapping on output either single byte to sin gle byte or single byte to string A string is specified as a series of character specifications For example output 0x82 map with accent to to display e with accent y K P L L print KILL when kill character is used Scancodes section This section will only have an effect when your terminal is a scancode device No error message will be produced when this section is mistak enly in your mapfi
130. l The international currency symbol applicable to the current locale. The first three characters contain the alphabetic international currency symbol in accordance with those specified in ISO 4217, Codes for the Representation of Currency and Funds. The fourth character (immediately preceding the null character) is the character used to separate the international currency symbol from the monetary quantity. char *currency_symbol: The local currency symbol applicable to the current locale. char *mon_decimal_point: The decimal point used to format monetary quantities. char *mon_thousands_sep: The separator for groups of digits before the decimal point in formatted monetary quantities. char *mon_grouping: A string whose elements indicate the size of each group of digits in formatted monetary quantities. char *positive_sign: The string used to indicate a nonnegative-valued formatted monetary quantity. INTERACTIVE UNIX System 1 International Supplement localeconv(3P) char *negative_sign: The string used to indicate a negative-valued formatted monetary quantity. char int_frac_digits: The number of fractional digits (those after the decimal point) to be displayed in an internationally formatted monetary quantity. char frac_digits: The number of fractional digits (those after the decimal point) to be displayed in a formatted monetary quantity. char p_cs_precedes: Set to 1 or 0 if the currency symbol respec
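The structure members listed above are read through the standard C function localeconv(), which returns a pointer to a struct lconv describing the current locale. The following is a minimal sketch of such a query; the helper names decimal_point_for and currency_symbol_for are hypothetical, and which locale names are installed depends on the system.

```c
#include <locale.h>
#include <stddef.h>

/*
 * Select the named locale for numeric formatting and return its
 * decimal-point string, or NULL if the locale is not available.
 */
const char *decimal_point_for(const char *locale_name)
{
    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return NULL;
    return localeconv()->decimal_point;
}

/*
 * Select the named locale for monetary formatting and return its local
 * currency symbol (an empty string when the locale defines none, as in
 * the "C" locale), or NULL if the locale is not available.
 */
const char *currency_symbol_for(const char *locale_name)
{
    if (setlocale(LC_MONETARY, locale_name) == NULL)
        return NULL;
    return localeconv()->currency_symbol;
}
```

The returned strings belong to the C library and may be overwritten by a later call to setlocale() or localeconv(), so a caller that needs them for longer should copy them.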
131. 4. ENTERING DATA The UNIX System is an interactive, multi-user, time-sharing operating system, which means that several computer users interact with the computer at the same time, usually by typing on a keyboard. This input, as well as the result of the computations done by the application used, is displayed on the computer screen as output. The device used to interact with the computer is either a self-contained unit with a keyboard and a screen that is connected to a serial port of the computer (a terminal), or a directly connected keyboard and a monitor attached to the computer's video card (usually referred to as the console). Input consists of keystrokes that typically represent letters and other symbols, which are pictured on the keys of the keyboard. A computer, however, speaks no particular language and has no notion of what a letter is. Instead, a letter is stored in a computer, either in its memory or in a file on the fixed disk, as a number. Unless every computer system uses the same number to store a certain letter, much confusion is created when attempting to transfer data from one type of machine to another. For that reason, conventions and standards for storing characters in a computer have been created. For more information about this, refer to section 5, "STORING DATA IN THE COMPUTER." Most keyboards today have 101 or 102 keys. These keys can be divided into three groups: • The centra
132. l information regarding the underlying architecture of the implementation Reference XPG3 Volume 2 Page 537 lt limits h gt 2 1 4 Error Conditions Question 5 Which of the following optional errors listed in the XPG are detected in the circumstances specified Answer Function Error Detected g access EINVALT No ETXTBSY Yes atof ERANGE Yes atoi ERANGE No atol ERANGE No cfsetispeed EINVAL No cfsetospeed EINVAL No chmod EINVAL No chown EINVALT No Page 2 1 6 Function closedir exec fentl fdopen feof ferror fileno fopen freopen fork fseek ftw getcwd isatty open opendir pathconf X Open Conformance Statement Questionnaire XCS QUE 3 2 Error Detected EBADFt Yes Yes ETXTBSY Yes EDEADLKt Yes EBADF No EINVAL No EBADF No EBADF No EBADF No EINVAL No ETXTBSY Yes EINVAL No ETXTBSY Yes ENOMEM Yes EINVAL Yes EINVAL No EACCESt Yes EBADF No ENOTTY No EINVAL Yes ETXTBSY Yes EMFILET Yes ENFILEt Yes EINVALT No ENAMETOOLONGt No Page 2 1 7 X Open Conformance Statement XCS QUE 3 2 Questionnaire Function Error Detected ENOTDIRT No fpathconf EBADFf No EINVALT Yes printf EINVAL Yes readdir EBADFT Yes rename ETXTBSY No scanf EINVAL Yes setvbuf EBADF No sigaddset EINVALT Yes sigdelset EINVALT Yes sigismember EINVALT Yes strcoll EINVAL No strerror EINVAL Yes
133. l section of the keyboard e The numeric keypad e The function keys The central section of the keyboard contains keys used to type regu lar letters and punctuation characters such as the period and semicolon The layout of this section of the keyboard differs from country to country The numeric keypad is a section of the keyboard that is designed for easy and fast access to all the numeric characters 0 9 and sym bols indicating operators such as plus and the asterisk It is often compared to the keys on a calculator This set of keys can be used in two modes In the first they generate the numerals and sym bols pictured on the keycaps in the second they act as special func keys and cursor movement keys The mode in effect is 8 International Supplement User s Manual indicated by the NUMLOCK light and can be changed by using the NUMLOCK key When the NUMLOCK light is on the keys gen erate the numerals and symbols on the keycaps The layout of the function key section of the keyboard depends on the manufacturer but today most computer keyboards are relatively standard They usually contain 10 or 12 function keys on the top row of the keyboard labeled to eed These keys gen erate sequences of characters such as escape the code gen erated by the escape key o p often called escape sequences Applications can take advantage of these keys by determining the actual escape sequence generated by a fun
134. le because the program will find out whether the terminal is a scancode device or not The lines in this sec tion can have two different formats One format will be used to describe what the values of the function keys must be The other for mat describes the translation of scancodes into a byte or an escape sequence No specific order is required Function keys Here is an example of a line defining a string for a function key F13 d a t e amp An SHIFT Fl is the date command The numbering convention of the functionkeys is described in a ous section Currently the use of quoted strings such as dateW is not supported Scancodes Specifying how to translate a scancode is a more complex task The general format of such a line is scancode normal shift alt shiftalt flags INTERACTIVE UNIX System 6 International Supplement ttymap 1 ttymap 1 scancode should list the hexadecimal representation of a scancode gen erated by a key unquoted How keys correspond with scancodes can be found in keyboard 7 normal shift alt and shiftalt are character representations in one of the formats described throughout this document optionally followed by one of the following special keywords IC This indicates that the key is influenced by the CTRL key IN This indicates that ESC N should preceed the specified character IO This indicates that ESC should preceed the specified character I This indicates that ESC
135. le from a binary mes sage catalogue the opposite of gencat 1P If the binary file is confidential ie it was generated by gencat c no attempt is made to translate it to source and a corresponding message is printed If the binary file is not confidential but is not in the proper format i e it is corrupted then the source file will not be generated The generated source file uses quoting with the double quote as the quote character For the message text printable characters in the locale are written as is in the source file For the other characters if there is a defined escape sequence that is written otherwise an octal bit pattern is written EXAMPLE The following is an example of the source file format generated by showcat quote set 1 1 This is set 1 message 1 2 This is set 1 message 2 3 This is set 1 message 3 It is continued where there was aW newline character in the input set 3 This is set 3 message 1 3 This is set 3 message 3 5 This is set 3 message 5 The following within single quotes is n the representation of the character with value 200 octal n when showcat is run in the C locale 200 SEE ALSO gencat 1P INTERACTIVE UNIX System 1 International Supplement ttymap 1 ttymap 1 NAME ttymap set terminal mapping and scancode translation SYNOPSIS ttymap mapfile ttymap r ttymap d DESCRIPTION ttymap is a utility that permits a user
136. le types may be executed Reference XPG3 Volume 2 Page 129 exec Page 2 2 1 X Open Conformance Statement XCS QUE 3 2 Questionnaire 2 2 2 Process Termination Question 10 Js the SIGCHLD signal sent to the parent process when a child exits Answer Yes Rationale Some systems support the sending of SIGCHLD in these cir cumstances This is mandatory if job control is supported Reference XPG3 Volume 2 Page 132 exit 2 2 3 Process Environment Question 11 Js the setpgid interface provided Answer Yes Rationale This interface is mandatory on systems which support job control and may be provided on other systems Reference XPG3 Volume 2 Page 3 Status of Interfaces Page 2 2 2 Conformance Statement XCS QUE 3 2 Questionnaire Section 2 3 File Handling 2 3 1 Access Control Question 12 What file access control mechanisms does the implementa tion provide Answer Standard access control is provided Options 1 Standard access control is provided 2 Refer to POSIX 1 Conformance Document Section 2 4 3 Provide a definition of the additional or alternate access mechanisms Rationale The XPG and POSIX allow an implementation to provide either additional or alternate file access control mechanisms other than the standard access control mechanism The document should either describe or provide a reference to the details of alternate or addi tional access mechanisms In
137. locales all others should be considered private Installation procedures are the same for both private and public locales Only the system administrator should be able to create modify or delete public locales 1 As a first step create a directory with the desired name of the locale within lib locale ISC or in case of a private locale the appropriate directory Then the individual categories should be created as described in the following sections LC COLLATE The information in the LC COLLATE file is generated via the colldef utility For details see the utility description LC_CTYPE The information in the LC_CTYPE file is generated via the chrtb util ity After executing the chrtbl utility the generated data file must be copied or moved to the locale directory and given the name of LC_CTYPE As an example assuming that the name or the desired locale is fr FR 8859 and the chrclass value the character classification table is french then the following steps should be performed Schrtbl sourcename cp french lib locale ISC fr FR 8859 LC CTYPE INTERACTIVE UNIX System 2 International Supplement locale SP locale 5P LC MESSAGES The information in the LC MESSAGES file is in text format and defines the strings associated with the affirmative and negative responses used by selected utilities Each line in the text file contains a keyword and a value separated by space s or tab
138. map file The following example converts the contents of the file mail x400 from codeset ISO 6937 1983 to ISO 8859 1 1987 and stores the results in the file mail local iconv f 6937 cmap t 8859 cmap mail x400 gt mail local 8859 is used as a synonym for 8859 1 both in the built in table 8859 and in the charmap file lib charmap 8859 cmap SEE ALSO charmap 5P INTERACTIVE UNIX System 2 International Supplement loadfont 1 NAME loadfont 1 loadfont list or change font information in the RAM of the video card SYNOPSIS loadfont loadfont f filename loadfont codepage loadfont loadfont d loadfont m mode DESCRIPTION The oadfont utility allows a user to load and activate a different font into the RAM of the video card used by the console of the INTER ACTIVE UNIX Operating System It can also be used to display infor mation about the font currently in use In addition the m option can be used to change the size of the characters on the screen it can also be used to change the number of lines or colors e g to run an applica tion at the console at 43 lines at a time instead of 25 loadfont will always read from standard output this will allow a system administra tor to use it from a remote terminal Options loadfont When used without arguments oadfont displays the different ways the command can be used as shown in the synopsis loadfont f filename This co
139. mentation defined. Text strings can contain the special characters and escape sequences defined in the following table:

Description        Symbol   Sequence
new line character NL(LF)   \n
horizontal tab     HT       \t
vertical tab       VT       \v
backspace          BS       \b
carriage return    CR       \r
form feed          FF       \f
backslash          \        \\
bit pattern        ddd      \ddd

The escape sequence \ddd consists of a backslash followed by one, two, or three octal digits, which are taken to specify the value of the desired character. If the character following a backslash is not one of those specified, the backslash is ignored. A backslash followed by an ASCII new line character is also used to continue a string on the following line. Thus, the following two lines describe a single message string:

1 This line continues \
to the next line

which is equivalent to:

1 This line continues to the next line

SEE ALSO gencat(1P). NOTE TO USERS This entry is reprinted from the INTERACTIVE SDS Guide and Programmer's Reference Manual. INTERACTIVE UNIX System 2 International Supplement loadfont(4) NAME loadfont - format of a loadfont input file DESCRIPTION File This section describes the format of files that can be used to change the font used by the console when using the loadfont utility with the -f option. The format is compatible with the Bitmap Distribution Format, version 2.1, as developed by Adobe Systems, Inc.; however, certain restrictions apply. Video cards, when used with the INTERACTIVE
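The escape rules for message text described in the gencat(4P) excerpt above can be illustrated with a short decoder. The following is a minimal sketch written for this purpose, not the code used by gencat; the function name decode_escapes is hypothetical.

```c
#include <stddef.h>

/*
 * Decode backslash escapes in src into dst (at most size-1 bytes plus NUL).
 * Recognised: \n \t \v \b \r \f \\ and \ddd (one to three octal digits).
 * A backslash followed by a newline joins two lines (both are dropped).
 * For any other character after a backslash, the backslash is ignored.
 * Returns the number of bytes stored, not counting the terminating NUL.
 */
size_t decode_escapes(const char *src, char *dst, size_t size)
{
    size_t n = 0;

    if (size == 0)
        return 0;

    while (*src != '\0' && n + 1 < size) {
        char c = *src++;

        if (c == '\\' && *src != '\0') {
            c = *src++;
            switch (c) {
            case 'n':  c = '\n'; break;
            case 't':  c = '\t'; break;
            case 'v':  c = '\v'; break;
            case 'b':  c = '\b'; break;
            case 'r':  c = '\r'; break;
            case 'f':  c = '\f'; break;
            case '\\': break;            /* literal backslash */
            case '\n': continue;         /* line continuation */
            default:
                if (c >= '0' && c <= '7') {
                    int value = c - '0';
                    int digits = 1;

                    while (digits < 3 && *src >= '0' && *src <= '7') {
                        value = value * 8 + (*src++ - '0');
                        digits++;
                    }
                    c = (char)(value & 0xFF);
                }
                /* otherwise: unknown escape, keep the character itself */
                break;
            }
        }
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

A lone backslash at the very end of the input is copied through unchanged, which is a choice made for this sketch rather than something the excerpt specifies.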
140. mmand reads the contents of filename and subse quently loads the font specified in the file into the RAM of the video card If the file does not have the correct format an error message is produced loadfont codepage If codepage is the name of a hard coded font available for the current font size this font will be loaded into the RAM of the video card and activated Available font names are listed when the 1 option is used If the codepage argument specified is not the name of a valid font an error message will be produced loadfont This option displays a short description of the fonts that are hard coded into the program and the name that can be passed as Only fonts that match the current font size are listed oadfont 1 also displays the different charac ter modes supported by and the exact name that should be used with the m option Here is a sample output INTERACTIVE UNIX System 1 International Supplement loadfont 1 loadfont 1 Codepages supported for this size font are Name Description 437 IBM 437 codepage 8859 ISO 8859 1 codeset 8859g ISO 8859 1 with graphics 850 IBM 850 codepage Different possible text modes supported are Name Description E80x43 80 columns 43 lines E40x25 40 columns 25 lines E80x25 80 columns 25 lines V40x25 40 columns 25 lines V80x25 80 columns 25 lines 8859g means the 8859 1 codeset with box d
141. n 8 bit coding scheme referred to as IBM extended ASCII has been used for several years This codeset is currently referred to as IBM codepage 437 In heterogeneous UNIX System environments a different codeset called ISO 8859 1 has been pro moted Both of these codesets are supersets of ASCII Although an 8 bit system meets most of the European requirements for the major Asian Languages a 16 bit system is necessary even to support a single language it should function properly in conjunc tion with the available hardware and in particular with the termi nals To use characters from the French German Finnish and other alphabets several terminals are available that generate 7 bit codes but display the characters from those alphabets on the screen instead of the ones found on a U S terminal Their keyboards have the same number of keys but different characters are pictured on the key caps Others like the DEC VT220 support 256 characters at a time but use their own proprietary codeset and have an extra COMPOSE key To illustrate the problems that occur when trying to use such termi nals in a mixed language environment imagine an INTERACTIVE UNIX System with a console and a French 7 bit terminal connected to the serial port When editing a file on the terminal and using the French character in text the terminal hardware actually gen erates the ASCII code 123 which is the code normally used for the International Suppleme
142. n access to a different method of displaying the date, typically in a different language, but the feature can also be used if you want to call Saturday "Partyday" instead, for instance. The date and time information needs to be stored in a text file. The following information is required:

• Abbreviated month names, in order
• Month names, in order
• Abbreviated weekday names, in order
• Weekday names, in order
• Default strings that specify formats for local time and date
• Strings used to replace AM and PM

This file must be stored in the directory /lib/cftime. When the shell variable LANGUAGE is set to the name of the file, the date and time are displayed accordingly. Note that the X/Open mechanism uses LC_TIME instead.

9.2 Character Classification

UNIX System V Releases 3.1 and later also support character classification. A utility, chrtbl, converts a text file that contains a description of the codeset into a binary file. When that file is installed in /lib/chrclass and the shell variable CHRCLASS is set to the name of that file, the correct character classification is used. The format of that file is described in chrtbl(1M). Note that the X/Open mechanism uses LC_CTYPE instead. Although the use of the X/Open announcement mechanism is recommended, the System V method should be used for System V utilities and applications, such as vi, which were not internationalised for XPG3
143. nce class Secondary ordering and tertiary ordering are defined using the characters themselves The uppercase letters collate before the lowercase ones and the accented letters after the unaccented ones The two strings Ga and Ca first compare as CA versus CA Based on secondary weights they still compare as equals C lt LOWER_CASE gt versus C lt LOWER_CASE gt On ter tiary weight comparison the two strings compare as G lt LOWER_CASE gt versus C lt LOWER_CASE gt that is the second compares lower The string ch compares as a single element The string Bach consists of three collating elements and collates after the string Back The character 8 eszet or sharp s is a German character that collates as two esses ss This means that the two strings Strasse and Strafe should collate as equals All characters not explicitly defined or implicitly included via an ellipsis are placed last in the collation sequence in order according to their coded values They are ignored for colla tion purposes 32 International Supplement Manual for Advanced Users 5 5 7 Use in Regular Expressions and Shell Pattern Matching The collation sequence determines how bracket expressions in regu lar expressions are interpreted l characters are valid in a bracket expression Multicharac ter collating elements such as lt ch gt in the example above are also recognised Multicharacter collating elements must be entered usi
144. nd see a Q on his screen. Several status keys can influence the translated code as well. The keyboard driver, and thus the ttymap program, makes a distinction between two sets of key combinations that can be translated:

Function keys
Up to 60 key combinations are recognised as function keys. The first 12 are the 12 function keys of a 101-key PC keyboard (the first 10 on an 84-key keyboard). If you do not know whether you have an 84-key or a 101-key keyboard, you can use the following scheme to determine which type you have: if your keyboard has arrow keys that are separate from the ones on the numeric keypad, you have a 101-key keyboard; if the arrow keys are located on the numeric keypad only, you have an 84-key keyboard. F13 to F24 are the same keys used in combination with SHIFT, F25 to F36 when used with CTRL, and F37 to F48 when used with CTRL and SHIFT together. F49 to F60 are the keys on the numeric keypad, in the following order: INS. Each of these function keys can be given a string as a value. The total length of all strings should not exceed 512 characters. See keyboard(7) for a list of default values.

Regular keys
Scancodes generated by all keys on the PC keyboard can be translated in a different way as well. For each key, a different translation can be specified for each of the following four cases:
1. The key is pressed.
2. The key and the SHIFT key are pressed simultaneously.
3. The key
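The function key numbering described above can be sketched in C. This is only an illustration of the ranges quoted in the text; the function names fkey_group and fkey_base are hypothetical and are not part of the keyboard driver or of ttymap.

```c
#include <stddef.h>

/* Hypothetical helper: describes how function key number n (1..60) is
 * produced, following the ranges given in the text.
 * Returns NULL when n is outside 1..60. */
const char *fkey_group(int n)
{
    if (n < 1 || n > 60)
        return NULL;
    if (n <= 12)
        return "function key alone";
    if (n <= 24)
        return "SHIFT";
    if (n <= 36)
        return "CTRL";
    if (n <= 48)
        return "CTRL+SHIFT";
    return "numeric keypad";
}

/* Hypothetical helper: position (1..12) of key n within its group,
 * so that F14 is the second key of the SHIFT group. */
int fkey_base(int n)
{
    return (n - 1) % 12 + 1;
}
```

For example, fkey_group(14) yields "SHIFT" and fkey_base(14) yields 2, that is, F14 is the second function key pressed together with SHIFT.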
145. nes INTERACTIVE supplies tt ymap files for the console to support all major keyboard types These files are delivered with the INTER ACTIVE UNIX Operating System in the usr lib keyboard directory and are named A number of other ttymap files and font files which have names with the suffix baf for example vga855 bdf some of which have been supplied to INTER ACTIVE by third parties are distributed with the International Sup plement on an as is basis The ttymap files include Language Codesets Territory 863 865 866 8859 1 X lt X X X X X X X X X X tal tal These files are located in directories under the usr lib keyboard directory that represent the codeset 437 850 863 and so on and are named for the anguage territory de DE for example In many cases the experienced user or the system administrator needs to create or modify an existing mapfile to support a specific terminal or environment The following categories deter mine how the mapping should be configured International Supplement Manual for Advanced Users 9 e The type of terminal used e The codeset used e The layout of the keyboard used e The country it is used in or the language spoken by the user Each time one of these categories changes a different t tymap file is required 10 International Supplement Manual for Advanced Users 3 SPECIF
146. ng a special bracket dot syntax for example ch to distin guish the multicharacter element from the sequence cn characters belonging to an equivalence class can be refer enced using the special bracket equal syntax a is shorthand for A in the example above Range expressions are interpreted according to the basic char acter collation order that is the order in which the characters are listed in the definition In the previous example all char acters not explicitly specified collate last via the UNDEFINED statement This means that using the previous example a s only specifies the characters in the list between a and S ch 5 Likewise a range such as r t will not contain s To be able to find both Strasse and Strafe in text with one expression it is necessary to make ss into a collat ing element Then the following regular expression will find both strings Stra ss 8 e International Supplement Manual for Advanced Users 33 6 SPECIFYING NUMERIC AND MONETARY INFORMATION Numeric and monetary formatting determines how numeric and monetary items appear This section explains how it can be used and how the files that contain the information should be set up 6 1 Reasons for Defining Numeric and Monetary Formatting The default conventions for decimal delimiter and other numeric formatting rules are seldom appropriate in an international environ ment
147. ngs of characters When a keyword is followed by more than one operand the 22 International Supplement Manual for Advanced Users operands must be separated by semicolons blanks are allowed before and or after a semicolon A line modifying the comment character the default is can be inserted before the header The format is comment char new comment character starting in the first column Empty lines and lines containing the new comment character in the first position are ignored A line modifying the escape character the default is a backslash V can also be inserted before the header The format is escape c har escape character starting in the first column A line can be continued by placing an escape character as the last character on the line Comment lines cannot be continued on a subsequent line using an escaped newline character Individual characters characters in strings or collating elements can be represented in operands in any of the following formats 1 Symbolic notation A character is specified via a symbolic character name enclosed within angle brackets lt gt A symbolic name including the angle brackets must either be a symbol defined via a collating symbol or collating element keyword or must exactly match a symbolic name defined in the charmap file specified via the colldef f option It is not an error to specify a collating element via a charmap symbol that does not exist in the current c
148. nsist of two strings separated by a semicolon The first string must represent the ante meridiem designation the last string the post meridiem designation For example 14 International Supplement Manual for Advanced Users am pm AM 3 3 9 t_fmt_ampm Keyword This keyword is used to define the appropriate time representation in the 12 hour clock format with am_pm corresponding to the date r field descriptor The operand must consist of a string and may contain any combination of characters and date field descrip tors If this keyword is not defined the default I M S p is used For example t fmt ampm XI XM XS Xp 3 3 10 A Sample File LC TIME abday Die Mit Sam Sonntag Montag Dienstag V Mittwoch Donnerstag Freitag Samstag abmon Jan Feb M rz Apr Mai Juni Juli Aug Sept Okt Nov Dez mon Januar Februar Marz April Mai Juni Juli August V September Oktober November Dezember d t fmt 5 Xp X m Xd Xy d fmt Xm Xd Xy t fmt XI XM XS Xp am pm NM t_fmt_ampm XI XM XS Xp END LC TIME 3 3 11 How a Program Uses This Information If a program needs to access the values in the current locale it can do so via the library subroutine n1 langinfo as well as by using the definition via the strftime library subroutine refer to ctime 3P
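As a rough illustration of how a program might apply the default 12-hour format through strftime, consider the following sketch. It assumes that the default string printed above as "I M S p" stands for the strftime format "%I:%M:%S %p" (the percent signs and colons appear to have been lost in printing); the function name format_12_hour is hypothetical.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: formats a time of day with the assumed default
 * t_fmt_ampm string. Returns the number of characters written, or 0 if
 * the buffer is too small (as strftime does). */
size_t format_12_hour(char *buf, size_t buflen, int hour, int min, int sec)
{
    struct tm t = {0};

    t.tm_hour = hour;   /* 0..23 */
    t.tm_min = min;
    t.tm_sec = sec;
    /* %I = hour on the 12-hour clock, %p = ante/post meridiem string */
    return strftime(buf, buflen, "%I:%M:%S %p", &t);
}
```

The strings substituted for %p come from the am_pm definition of the locale in effect, so the same call produces different output once a locale other than the default is selected.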
149. nt loadfont 4 loadfont 4 FILES usr lib loadfont vga437 bdf SEE ALSO loadfont 1 NOTE TO USERS This entry is reprinted from the INTERACTIVE SDS Guide and an Programmer s Reference Manual E INTERACTIVE UNIX System 4 International Supplement 5 charmap SP NAME charmap character set description file lib charmap w DESCRIPTION The INTERACTIVE UNIX System supports single byte coded charac ter sets that are supersets of the ASCII coded character set Examples of such coded character sets are IBM codepage 437 This is the familiar IBM PC codeset which is the default codeset in the INTERACTIVE UNIX System IBM codepage 850 This is the IBM International codepage ISO IEC 8859 1 This is an international standard coded character set also known as Latin Alphabet No 1 which covers Western European languages Note that the 7 bit ASCII codeset must be contained within each of these codesets The charmap files are used to define and document the supported coded character sets primarily for use in the colldef 1P and iconv 1P utilities Each character in the coded character set is described with a symbolic name and the character encoding The INTERACTIVE UNIX System provides charmap files for the above coded character sets as well as a for ASCII Users may add charmap files provided that the following rules are followed 1 new charmap must contain
150. nt Manual for Advanced Users 3 left curly brace. This example assumes that the terminal uses the French national variant of ASCII, called ISO 646f. If the file that was edited is looked at on the console, the letter é actually appears to be a curly brace. Therefore, input and output mapping should be supported by the tty subsystem to allow consistent use of one single codeset throughout the system. Implementing character mapping support inside the tty subsystem has the advantage that its features are automatically supported by all peripherals that use the standard line discipline, without modifying the device drivers for these peripherals.

2.2 Mapping Features

For each tty device, character mapping can be done on input as well as on output. The information is stored in a buffer, the size of which should not exceed 1K. The following mapping features are supported:

• Input mapping
On input, any byte can be mapped to any byte. Using the example from the previous section, 123 could be mapped to 130, the code used for é in the IBM extended ASCII codeset, or to hexadecimal E9, its equivalent in the ISO 8859-1 codeset.

• Output mapping
On output, any byte can be mapped to either a byte or a string. In the previous example, 130 or E9 would be mapped back to 123 to properly display the character on the screen. If the connected device is a printer that does not support the character, it can be mapped into a string that starts with e BACKSPACE, which lets the printer overstrike the letter with an accent.

• Deadkeys
Ce
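Input mapping of the kind described above amounts to a 256-entry lookup table. The following is a minimal sketch of the idea only, not the tty subsystem's actual data structure; the function names build_input_map and map_input are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: builds a table that maps every byte to itself,
 * then overrides the single entry from the example in the text. */
void build_input_map(unsigned char map[256])
{
    int i;

    for (i = 0; i < 256; i++)
        map[i] = (unsigned char)i;
    map[123] = 130;  /* code sent by the French 7-bit terminal -> codepage 437 code */
}

/* Hypothetical helper: translates a buffer in place, one byte at a time. */
void map_input(const unsigned char map[256], unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = map[buf[i]];
}
```

An output map for the same terminal would be the mirror image: identity everywhere except that 130 is mapped back to 123.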
151. nternational Supplement Manual for Advanced Users

1. The character used as a decimal delimiter
2. The character used to separate groups of digits (thousands separator)
3. The size of such groups

It should be noted that while the standard INTERACTIVE UNIX System library subroutines printf, scanf, and strtod (refer to printf(3P), scanf(3P), and strtod(3C) for more information) are sensitive to the decimal delimiter, they do not support grouping of digits. Consequently, while user-developed functions can and should take into account grouping and thousands separators, the standard functions do not.

6.5 Creating a Numeric Category Definition

The source language for the numeric category in the INTERACTIVE UNIX System is the language defined by the POSIX.2 group for the LC_NUMERIC locale category. A numeric editing source definition consists of a header, a numeric editing body, and a trailer. The header is the word LC_NUMERIC. The trailer is the string END LC_NUMERIC. The numeric editing body consists of one or more lines of text. Each line contains a keyword followed by one or more operands. Keywords are separated from the operands by one or more blank characters (space or tab). Operands are characters, strings of characters, or digits. When a keyword is followed by more than one operand, the operands must be separated by semicolons. Blank characters are allowed before and/or after a semicolon. Strings must be surrounded by q
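Because the standard subroutines do not group digits, a user-developed function has to insert the separators itself. The following sketch shows one possible way to do so for a single, fixed group size; the name group_digits and its parameters are hypothetical, and a real function would take the separator and group size from the current locale rather than from its arguments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the digit string into out, inserting sep
 * between groups of `size` digits counted from the right.
 * Returns 0 on success, -1 if size is invalid or out is too small. */
int group_digits(const char *digits, char sep, int size,
                 char *out, size_t outlen)
{
    size_t n = strlen(digits);
    size_t i, o = 0;

    if (size <= 0)
        return -1;
    for (i = 0; i < n; i++) {
        /* A separator precedes digit i when the digits still to be
         * copied form a whole number of groups (never at the start). */
        if (i > 0 && (n - i) % (size_t)size == 0) {
            if (o + 1 >= outlen)
                return -1;
            out[o++] = sep;
        }
        if (o + 1 >= outlen)
            return -1;
        out[o++] = digits[i];
    }
    if (o >= outlen)
        return -1;
    out[o] = '\0';
    return 0;
}
```

With a period as the separator and a group size of 3, the digits 1234567 become 1.234.567.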
152. nth of the year e g October MON 11 Name of the eleventh month of the year eg November MON 12 Name of the twelfth month of the year e g December ABMON_1 Abbreviated name of the first month of the year ABMON 2 Abbreviated name of the second month of the year ABMON 3 Abbreviated name of the third month of the year ABMON 4 Abbreviated name of the fourth month of the year ABMON 5 Abbreviated name of the fifth month of the year ABMON 6 Abbreviated name of the sixth month of the year ABMON 7 Abbreviated name of the seventh month of the year ABMON 8 Abbreviated name of the eighth month of the year ABMON 9 Abbreviated name of the ninth month of the year ABMON 10 Abbreviated name of the tenth month of the year ABMON 11 Abbreviated name of the eleventh month of the year ABMON 12 Abbreviated name of the twelfth month of the year RADIXCHAR Decimal delimiter THOUSEP Thousands separator YESSTR Affirmative response for yes no Note that this is returned as an uncompiled regular expression NOSTR Negative response for yes no Note that this is returned as an uncompiled regular expression CRNCYSTR Currency symbol preceded by if the symbol should appear before the value by if the symbol should follow the value or by if the symbol should replace the decimal delimiter SEE ALSO nl langinfo 3P NOTE TO USERS This entry is reprinted from the INTERACTIVE SDS Guide and Programmer
153. o that locale for all categories, regardless of whether any of the other variables are set. Example: LC_ALL=fr_FR.437

LC_COLLATE
This environment variable defines the desired environment for the LC_COLLATE category (LC_COLLATE=fr_CA.863, for example).

LC_CTYPE
This environment variable defines the desired environment for the LC_CTYPE category. Example: LC_CTYPE=C

LC_MESSAGES
This environment variable defines the desired environment for the LC_MESSAGES category (LC_MESSAGES=de_DE.850, for example).

LC_MONETARY
This environment variable defines the desired environment for the LC_MONETARY category. Example: LC_MONETARY=es_ES.8859-1

LC_NUMERIC
This environment variable defines the desired environment for the LC_NUMERIC category (LC_NUMERIC=da_DK.865, for example).

LC_TIME
This environment variable defines the desired environment for the LC_TIME category. Example: LC_TIME=en_UK.437

LANG
If this environment variable is set, the specified value is used for all categories not explicitly set; in other words, it is the fallback (unless LC_ALL is also set). The LANG variable is also used to locate a specific message catalogue. Example: LANG=en_US

8. INTERNATIONALISED BEHAVIOUR

This section explains how the international environment affects the behaviour of system utilities and applications.

8.1 Date and Time Format

The default conventions for the date and time format as well as the names
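The order in which these variables are consulted can be sketched as follows. This is an illustration of the precedence described above, not the library's implementation; the function name effective_locale is hypothetical, and the final fallback to "C" when nothing is set is an assumption based on the C or POSIX locale being the default.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the locale name that would apply to one
 * category. category_var is the name of the category's own variable,
 * for example "LC_TIME". */
const char *effective_locale(const char *category_var)
{
    const char *v;

    v = getenv("LC_ALL");           /* overrides everything else */
    if (v != NULL && *v != '\0')
        return v;
    v = getenv(category_var);       /* the category's own variable */
    if (v != NULL && *v != '\0')
        return v;
    v = getenv("LANG");             /* fallback for categories not set */
    if (v != NULL && *v != '\0')
        return v;
    return "C";                     /* assumed default when nothing is set */
}
```

In practice a program does not need to do this itself: passing a null string to setlocale asks the library to consult the environment.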
154. oes not exist SEE ALSO catclose 3P catgets 3P environ 5P in the INTERACTIVE SDS Guide and Programmer s Reference Manual NOTE TO USERS This entry is reprinted from the INTERACTIVE SDS Guide and Programmer s Reference Manual INTERACTIVE UNIX System 2 International Supplement gw localeconv 3P localeconv 3P NAME localeconv numeric formatting convention inquiry SYNOPSIS include lt locale h gt struct Iconv localeconv void DESCRIPTION The localeconv function sets the components of an object with type struct lconv with values appropriate for the formatting of numeric quantities monetary and otherwise according to the rules of the current locale The members of the structure with type char are pointers to strings any of which except decimal point can point to to indicate that the value is not available in the current locale or is of zero length The members with type char are non negative numbers any of which can be CHAR MAX to indicate that the value is not available in the current locale The members include the following char decimal point The decimal point character used to format non monetary quantities char thousands sep The character used to separate groups of digits before the decimal point character in formatted non monetary quantities char grouping A string whose elements indicate the size of each group of digits in formatted non monetary quantities char int_curr_symbo
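A short sketch of a localeconv inquiry follows. It selects a locale for the numeric category and returns the decimal delimiter that localeconv reports; the wrapper name decimal_delimiter is hypothetical, while setlocale, localeconv, and the decimal_point member are the standard ones.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: selects locale_name for LC_NUMERIC and returns
 * the decimal delimiter of that locale, or NULL if the locale cannot
 * be selected. The returned string belongs to the library and may be
 * overwritten by later calls to localeconv or setlocale. */
const char *decimal_delimiter(const char *locale_name)
{
    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return NULL;
    return localeconv()->decimal_point;
}
```

In the default locale the delimiter is a period and the thousands separator is an empty string, so decimal_delimiter("C") returns ".".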
155. of the following new default char gt dnnn xnn nnn The first specification which must be a valid charmap symbol from the file defined as the to file is only valid if charmap files rather than the built in tables are specified The latter three formats can only be used with the built in tables and specify the code value of the new default character When the charac ter following the is d then nnn is a decimal value e g 43 for the plus sign When the character following the is x then is a hex adecimal value e g 2B for the plus sign When the character follow ing the is numeric then nnn is an octal value e g 53 for the plus sign EXAMPLES 1 The following example uses the built in tables to convert from the ISO IEC 8859 1 codeset to the IBM codepage 437 codeset and uses the plus character as the default output character iconv f 8859 t 437 S d43 file INTERACTIVE UNIX System 1 International Supplement iconv 1P 2 NOTE iconv 1P In the following example both the fromcode file 8859 4 cmap and the tocode file 865 cmap must exist in the directory lib charmap iconv f 8859 4 cmap t 865 cmap S lt plus sign gt infile gt outfile In the following example the fromcode file is located in the current directory The tocode file being utilized is in the mydir subdirectory of the current directory iconv f 8859 5 cmap t mydir 866 c
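The three numeric notations quoted above for the plus sign (43 decimal, 2B hexadecimal, 53 octal) all denote the same code value. The following sketch checks this with strtol; it deliberately does not parse the option syntax of iconv itself, and the function name code_value is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: converts a string of digits in the given base
 * (10, 16 or 8) into a single-byte code value.
 * Returns -1 if the text is not a valid number in the range 0..255. */
int code_value(const char *digits, int base)
{
    char *end;
    long v = strtol(digits, &end, base);

    if (end == digits || *end != '\0' || v < 0 || v > 255)
        return -1;
    return (int)v;
}
```

Here code_value("43", 10), code_value("2B", 16), and code_value("53", 8) all return the code of the plus sign.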
156. of the UNDEFINED symbol and ordered accord ing to their coded character set values If no UNDEFINED symbol is specified and the current coded character set contains characters not specified in this clause colldef issues a warning message and places such characters at the end of the character collation order The optional operands for each collating element are used to define the primary secondary or subsequent weights for the collating element The first operand specifies the relative primary weight the second the relative secondary weight and so on Two or more collating elements can be assigned the same weight They are said to belong to the same equivalence class In string collation each pair of strings is first compared based on pri mary weight If equal collating elements belonging to pri mary equivalence classes are compared again based on their secon dary weights If stil equal secondary equivalence class elements are compared again based on tertiary weights up to the limit conn wEIGHTS MAX Weights must be expressed as characters in any of the forms specified above collating symbols collating International Supplement Manual for Advanced Users 29 elements an ellipsis or the special symbol IGNORE single character a collating symbol symbol or a collating element symbol represents the relative order in the character col lating sequence of the character or symbol rather than its absolute value Multiple c
157. of the days of the week and months follow U S conventions and are rarely applicable in other countries By defining and using the date and time environment the dates and times displayed by the system utilities and applications follow the local conventions and use the names of the days and months in the correct language The following aspects of formatting are supported by the INTER ACTIVE UNIX Operating System e Format of time display e Format of date display e Format of combined date and time display e Format of 12 hour time display e Names of days of the week N Abbreviated names of days of the week e Names of the months e Abbreviated names of the months e Format of the ante meridiem and post meridiem strings used in 12 hour clock time displays For example In a French environment the output of date could be Mardi 30 juillet 1991 11 07 35 PDT and the output of ls 1 total 636 rw r r 1 paul other 27399 janv 24 18 36 02 ch01 rw r r 1 paul other 13842 juil 9 18 36 03 ch02 rw r r 1 paul other 9057 mai 12 18 36 03 ch03 rw r r 1 paul other 263 mai 12 15 44 45 document rw r r 1 paul other 398 sept 24 12 37 34 Makefile TS rWXxr xr x 1 paul other 24202 avril 10 1991 show International Supplement User s Manual 33 8 2 Character Classification Regardless of how it is encoded a character has certain features For example it is either printable or nonprintable If a different
158. or U S dollar currency symbol Defines the character to be used as the currency sym bol for example mon decimal point Defines the decimal delimiter for monetary quantities mon thousands sep Defines the thousands separator for monetary quantities mon grouping Defines the grouping of digits positive sign Defines the positive sign negative sign Defines the negative sign int frac digits Defines the number of fractional digits displayed when formatting using the int curr symbol 38 International Supplement Manual for Advanced Users frac digits Defines the number of fractional digits displayed when formatting using the currency symbol p cs precedes Defines whether the currency symbol succeeds or precedes a positive quantity p sep by space Defines whether a space separates the currency symbol from a positive quantity n cs precedes Defines whether the currency symbol succeeds or precedes a negative quantity n sep by space Defines whether a space separates the currency symbol from a negative quantity p sign posn Defines the placement of the sign and a positive quantity n sign posn Defines the placement of the sign and a negative quantity 6 7 1 int curr symbol Keyword This keyword is used to define the international currency symbol The operand must be a four character string with the first three characters containing the alphabetic international currency symbol in accordance with those
159. particular the method by which an application can execute using standard file access control should be explained and details of the changes required to utilised the alter nate or additional access mechanisms should be given Reference XPG3 Volume 2 page 16 File Access Permissions Page 2 3 1 X Open Conformance Statement XCS QUE 3 2 Questionnaire 2 3 2 Files and Directories Question 13 Are any extended security controls implemented that could cause fstat or stat to fail Answer No Rationale The XPG notes that there could be an interaction between extended security controls and the success of fstat and stat This would suggest that an implementation can allow access to a file but not allow the process to gain information about the status of the file Reference XPG3 Volume 2 Page 478 tempnam 2 3 3 Formatting Interfaces Question 14 Js the L modifier to printf and scanf supported on this implementation Answer No Rationale The XPG notes that the L modifier which is exactly equivalent to the 1 modifier when the implementation does not differentiate between double and long double is not supported on all systems and is only included for compatibility with ANSI C Reference XPG3 Volume 2 Page 328 printf XPG3 Volume 2 Page 362 scanf Page 2 3 2 X Open Conformance Statement XCS QUE 3 2 Questionnaire Question 15 Does the printf function produce character string representations fo
160. perating System Guide and for more technical details refer to the manual entry keyboard 7 International Supplement User s Manual 9 The layout of the keyboard is not randomly chosen but is basically the same as on most typewriters The layout is often referred to as QWERTY after the order of the first five letters on the top row of keys containing letters By using the same layout on all typewriters and terminal keyboards computer users can type in text at a very high speed regardless of the equipment they are using Although one might expect that the layout was chosen to give the easiest access to the most frequently used characters this is not the case The QWERTY keyboard layout was originally designed to be slow enough so that mechanical typesetting machine operators would not be able to type fast enough to jam their machines Another keyboard layout called DVORAK places the most common letters in the English language on the home row of keys but this layout is not in common use 4 2 Generating Characters Not Present on a U S Keyboard Although non English characters like the German or the French are not present on a keyboard designed for use in American English most of these characters can be generated This allows non Americans to write French letters on American systems for exam ple There are three ways to generate characters for which there are no keycaps explicit symbols on the keyboard e Deadkeys e Compose
161. pment utilities behave in the manner specified for each of the options detailed in the XPG 2 A list of deviances for each of the utilities is provided This list should be in a tabular form giving the name of the utili ties the option and a description of the deviant behaviour Rationale This question provides a greater degree of granularity than the pre vious question requiring the semantic differences associated with the development utilities to be specified Page 3 2 2 X Open Conformance Statement XCS QUE 3 2 Questionnaire Section 3 3 Internationalisation Option 3 3 1 Commands and Utilities Question 5 Is an internationalised environment reflecting changes in the locale setting as described in XPG Volume 1 XSI Commands and Utilities supported Answer Except for mailx the commands listed below support Internationali sation in the manner specified in XPG3 Options 1 The commands listed below support Internationalisation in the manner specified in XPG3 2 A list of deviations in the Internationalised behaviour of the following commands compared to that specified in XPG3 is provided Command Behaviour Specified in XPG3 Supported ar LC TIME affects date format Yes awk LC COLLATE LC CTYPE affect regular Yes expression matching LC_COLLATE affects the behaviour of Yes string comparisons LC NUMERIC affects the behaviour of the Yes radix character As per POSIX 1 awk only recognizes the period
162. program to be translated into different languages and to be retrieved at run time according to the language requirements of the user. This means that a single application (a single UNIX System executable) can support many languages. The program can be translated without requiring access to the C source code of the application; all that is needed is a message catalogue source file in one language, which can be used to translate it into other languages. For performance reasons, two different message catalogue formats are used:

• A message text source file
• A message catalogue used by the application and produced from the message text source using a new utility called gencat (refer to gencat(1P))

INTERACTIVE has also added a utility, showcat, that can be used to translate the contents of a message catalogue into its message text source (that is, the opposite of the gencat utility), unless an option to prevent this translation was used when gencat was used to create the message catalogue. Refer to showcat(1P) for more information.

8.7 The X/Open Environment

The set of internationalisation features described previously functions according to the X/Open Portability Guide and far exceeds those supported by UNIX System V. Every application developed using the INTERACTIVE Software Development System and compiled with the -Xp option has access to this functionality. The International Supplement provides the ability to create and us
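At run time an application typically retrieves its messages with the X/Open catopen and catgets routines, keeping a built-in default text for the case in which no catalogue is available. The sketch below shows that pattern; the wrapper name message_or_default is hypothetical, and the set and message numbers are whatever the application's own catalogue defines.

```c
#include <nl_types.h>

/* Hypothetical helper: returns message msg_id of set set_id from an
 * already opened catalogue, or the built-in fallback text when the
 * catalogue could not be opened or does not contain the message. */
const char *message_or_default(nl_catd cat, int set_id, int msg_id,
                               const char *fallback)
{
    if (cat == (nl_catd)-1)     /* catopen failed */
        return fallback;
    return catgets(cat, set_id, msg_id, fallback);
}
```

A caller would open the catalogue once with catopen, pass the result to every lookup, and release it with catclose when the messages are no longer needed.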
163. r Infinity and NaN to represent the respective special double precision values Answer Yes Rationale This behaviour is often provided on systems with mathematical functions that produce these results Reference XPG3 Volume 2 Page 331 printf Page 2 3 3 X Open Conformance Statement XCS QUE 3 2 Questionnaire Section 2 4 General Terminal Interface 2 4 1 Interfaces Supported Question 16 Are the following terminal control interfaces provided tcgetpgrp tcsetpgrp Answer Yes Rationale These interfaces are mandatory for implementations that support job control Implementations that do not support job control may either always return the error indication ENOSYS or may provide the interface with the behaviour specified for an implementation that supports job control This later case is useful for implementa tions which support only part of the job control specifications Reference XPG3 Volume 2 Page 471 tcgetpgrp XPG3 Volume 2 Page 475 tcsetpgrp Page 2 4 1 X Open Conformance Statement XCS QUE 3 2 Questionnaire Section 2 5 Internationalised System Interfaces 2 5 1 Codesets Question 17 Does the implementation support the ISO 8859 1 1987 codeset for data transmission Answer Yes Rationale The XPG defines the ISO 8859 1 1987 as the major Western Euro pean transmission codeset and also recommends its use as the corresponding internal codeset Reference XPG3 Volume 3 Page 19
164. rawing characters in column 9 of the table (characters 0x90 to 0x9a).

loadfont -d
This reads the font information from the video RAM and writes it to standard output in a format compatible with the Bitmap Distribution Format, version 2.1, as developed by Adobe Systems Inc.

loadfont -m mode
This will attempt to change the mode of the console as specified. This will result in having a different font size and/or a different number of lines and columns on the screen. The mode that can be specified should be one of the choices listed above in the loadfont -l output. If an invalid argument is specified, an error message is produced.

Fonts
A font is the representation of characters by images. The need to use different fonts can be imposed by:

1. The codeset used to represent the characters internally
2. The resolution used to display the characters

Each font contains exactly 256 images. All fonts supported are fixed size (constant width and constant height), i.e., each character takes the same amount of space on the screen. When the monitor is not being used in graphics mode, the loadfont utility allows a user to modify the font used by the video card, so different images are displayed on the screen of the console for the various characters. Depending on the type of video card used, different text modes can be supported by the same video card. They typically differ by the number of pixels used to represent a single character. For each charac
165. rency symbol The sign string precedes the quantity and currency symbol The sign string succeeds the quantity and currency symbol The sign string immediately precedes the currency symbol ye 4 The sign string immediately succeeds the currency symbol RETURN VALUES The ocaleconv function returns a pointer to the filled in object The structure pointed to by the return value shall not be modified by the program but may be overwritten by a subsequent call to the localeconv function In addition calls to the setlocale function with categories LC ALL LC MONETARY or LC NUMERIC may overwrite the contents of the structure SEE ALSO locale 5P NOTE TO USERS This entry is reprinted from the INTERACTIVE SDS Guide and Programmer s Reference Manual INTERACTIVE UNIX System 3 International Supplement nl langinfo 3P nl langinfo 3P NAME nl langinfo language information SYNOPSIS include nl types h include langinfo h gt char nl_langinfo item nl item item DESCRIPTION The nl langinfo function returns a pointer to a string containing infor mation relevant to the particular language or cultural area defined in the program s locale The manifest constant names and values of item are defined in the file usr include langinfo h For example nl langinfo ABDAY 1 would return a pointer to the string Dom if the identified language was Portuguese and Sun if the identified language
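The nl_langinfo inquiry described above can be tried with a few lines of C. The sketch selects a locale for the LC_TIME category and asks for the abbreviated name of the first day of the week; the wrapper name abbreviated_first_day is hypothetical, while nl_langinfo and ABDAY_1 are the names used in the manual entry.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: selects locale_name for LC_TIME and returns the
 * abbreviated name of the first day of the week, or NULL if the locale
 * cannot be selected. The string belongs to the library and may be
 * overwritten by later calls. */
const char *abbreviated_first_day(const char *locale_name)
{
    if (setlocale(LC_TIME, locale_name) == NULL)
        return NULL;
    return nl_langinfo(ABDAY_1);
}
```

In the default locale the result is the English abbreviation Sun.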
166. res that old applications do not suddenly start behaving strangely. In addition, a particular locale instance that describes the desired behaviour must also have been created. Such instances are referred to by their name. X/Open has adopted a format for constructing locale names that makes them easy to identify. The format is language[_territory[.codeset]], where language is a two-letter abbreviation (for example, fr for French), territory is a two-letter abbreviation (FR for France or CA for Canada, for example), and codeset is the codeset designation, such as 437. One locale is always present: the C or POSIX locale, which defines the traditional UNIX System behaviour. The creation of locale instances is described in the International Supplement Manual for Advanced Users. 7.2 Controlling the International Environment A programmer can set and change the locale explicitly inside a program. This can be done to ensure a particular environment, for example, so that a particular program always behaves the same way. In most cases, however, the programmer leaves the choice to the end user by specifying that the locale be set to what the end user specified via environment variables. The environment variables are LC_ALL, LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, LC_TIME, LANG. If this environment variable is set the environment is set t
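To make the X/Open naming convention concrete (a language, optionally followed by an underscore and a territory, optionally followed by a dot and a codeset), here is a small sketch that splits such a name into its parts. The struct locale_parts type, the function names, and the buffer sizes are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the three parts of a locale name. */
struct locale_parts {
    char language[16];
    char territory[16];
    char codeset[32];
};

/* Copy len bytes of src into dst and terminate; -1 if it does not fit. */
static int copy_part(char *dst, size_t dstsz, const char *src, size_t len)
{
    if (len >= dstsz)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}

/* Split "language[_territory[.codeset]]". Missing parts become empty
 * strings. Returns 0 on success, -1 on bad input or an oversized part. */
int split_locale_name(const char *name, struct locale_parts *out)
{
    const char *end, *dot, *us, *lang_end;

    if (name == NULL || out == NULL || name[0] == '\0')
        return -1;

    end = name + strlen(name);
    dot = strchr(name, '.');
    us = strchr(name, '_');
    if (us != NULL && dot != NULL && us > dot)
        us = NULL;              /* underscore belongs to the codeset part */

    lang_end = us ? us : (dot ? dot : end);
    if (copy_part(out->language, sizeof out->language,
                  name, (size_t)(lang_end - name)) != 0)
        return -1;

    out->territory[0] = '\0';
    if (us != NULL) {
        const char *terr_end = dot ? dot : end;
        if (copy_part(out->territory, sizeof out->territory,
                      us + 1, (size_t)(terr_end - (us + 1))) != 0)
            return -1;
    }

    out->codeset[0] = '\0';
    if (dot != NULL) {
        if (copy_part(out->codeset, sizeof out->codeset,
                      dot + 1, (size_t)(end - (dot + 1))) != 0)
            return -1;
    }
    return 0;
}
```

For the name fr_FR.437 this yields language fr, territory FR, and codeset 437; for the name C only the language part is filled in.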
167. rmative y and negative responses. LC_MONETARY: Affects the monetary formatting information returned by the localeconv function. LC_NUMERIC: Affects the decimal point character for the formatted input/output functions and the string conversion functions, as well as the non-monetary formatting information returned by the localeconv function. LC_TIME: Affects the behaviour of the strftime function. The value LC_ALL for category names all of the categories of the program's locale; LC_ALL is a special constant, not a category. The locale argument is a pointer to a character string that can be an explicit string, a NULL pointer, or a null string. When locale is an explicit string, the contents of the string determine the locale. The values POSIX or C for locale are reserved for the default locale, which is the environment required for C translation and also corresponds with the System V default behaviour. If setlocale is not invoked, the program's locale is the default locale. When locale is a NULL pointer, the program's locale is queried according to the value of category. The returned string contains the locale identifiers; if the category is LC_ALL, the string contains semicolon-separated locale identifiers. Portable programs cannot rely on either the content or format of the returned string. When locale is a null string, the setlocale function takes the name of the new locale for the specified category from the environm
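The three forms of the locale argument described in this entry (explicit string, NULL pointer, null string) might be used as in the sketch below. The wrapper names are hypothetical; only the standard setlocale call itself is taken from the text.

```c
#include <locale.h>
#include <stddef.h>

/* Null string: take the locale for every category from the environment.
 * Returns the locale name, or NULL if the request cannot be honoured. */
const char *adopt_user_locale(void)
{
    return setlocale(LC_ALL, "");
}

/* Explicit string: force the default locale so that the program always
 * behaves the same way, whatever the user's environment says. */
const char *force_default_locale(void)
{
    return setlocale(LC_ALL, "C");
}

/* NULL pointer: query the current setting of one category without
 * changing it. The content and format of the result are not portable,
 * so callers should only store it or pass it back to setlocale. */
const char *current_locale(int category)
{
    return setlocale(category, NULL);
}
```

A program would typically call adopt_user_locale once at start-up and fall back to the default locale if it returns NULL.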
168. rmine which file to read as the next volume when an end-of-file or end-of-media condition is encountered? Answer (Format: Method): Extended tar: Prompts when ready for the next volume and asks the user to type go when ready to proceed. There is no way to specify the device; the initial device is used. cpio: Prompts that it has reached the end of the medium and asks the user to type the device file name for the next archive when ready. Options: Description of method used by each utility. Refer to POSIX.1 Conformance Document, Section 10.1.3. Rationale: In many cases the utility will prompt the user for the path name of the device to use for the next volume. There may be extensions to the utility syntax which allow the definition of alternate addresses for subsequent volumes. Reference: XPG3 Volume 3, Pages 151-2, Utilities. International Supplement Reference Manual CONTENTS: chrtbl(1M), colldef(1P), gencat(1P), iconv(1P), loadfont(1), showcat(1P), ttymap(1), catclose(3P), catgets(3P), catopen(3P), localeconv(3P), nl_langinfo(3P), setlocale(3P), strcoll(3P), strerror(3P), strxfrm(3P), gencat(4P), loadfont(4), charmap(5P), langinfo(5P), locale(5P). chrtbl(1M) NAME: chrtbl - generate character classification and conversion tables. SYNOPSIS: chrtbl file. DESCRIPTION: The chrtbl command creates a character classification table and an upper/lowercase conversion table. The tables
169. rtain countries. INTERACTIVE's distributors sell a special version of the product to accommodate these special markets. Contact your sales representative for more information. To find out how to set up a user to use the system in an international environment, refer to the International Supplement Manual for Advanced Users. 2. INTERNATIONALISATION Computers and their method of operation have generally been associated with American English. Until recently, computer users and programmers accepted the fact that operating and programming a computer had to be in English. Internationalisation is the art of making a computer, a computer system, or a computer program (often called an application) function in a non-U.S. environment. The word itself illustrates that the different behaviour a computer system must support depends not only on the use of a different language, but also on the country of origin, even if the language is the same. Spelling may be different: for example, in American English the word is spelled internationalization, while in England the spelling is internationalisation. To avoid the spelling problem, the acronym I18N is becoming common: whether in the U.S. or England, internationalisation begins with the letter I, ends with N, and has 18 letters in between. When the word internationalisation is brought up in a conversation, people often react with comments such as
170. rtain keys on typewriters behave differently from the others because, when these keys are pressed, the carriage of the typewriter does not move. is such a character, for example. When it is followed by an e, the letter is generated. This is called a deadkey or a non-spacing character. The tty subsystem supports the use of deadkeys. Typically the character and the umlaut character are used as deadkeys. Compose sequences: Characters can also be generated using compose sequences. A dedicated character, called the compose character, followed by two other keystrokes generates a single character. As an example, COMPOSE followed by the plus sign and the minus sign could generate the plus-minus sign. Compose sequences can also be used as an alternative for deadkeys, for example COMPOSE e instead of e alone. Decimal representation: Rarely used characters can be generated by pressing COMPOSE followed by three digits, which are the decimal representation of the character. This feature has been added by INTERACTIVE. This should alleviate most of the inconvenience caused by the 1K limitation of the mapping buffer. Toggle key: An optional toggle key can be defined to temporarily disable the current mapping at any time. This can be useful when a German programmer wants easy access to the curly braces and the brackets. A toggle key is also used by Greek users to switch be
171. s. Strings must be enclosed in quotation marks; individual characters can be so enclosed, but it is not required. Lines starting with a are ignored. The following keywords are recognised. LC_MESSAGES: This keyword must be the first in the file. yesexpr: The value is a regular expression used to evaluate an affirmative response. The regular expression must be enclosed in quote marks. noexpr: The value is a regular expression used to evaluate a negative response. The regular expression must be enclosed in quote marks. END LC_MESSAGES: This keyword must be the last in the file. Example: LC_MESSAGES yesexpr Yy alpha noexpr Nn END LC_MESSAGES. LC_MONETARY: The information in the LC_MONETARY file is in text format. Each line in the text file contains a keyword and a value, separated by space(s) or tab(s). Strings must be enclosed in quotation marks; individual characters can be so enclosed, but it is not required. Lines starting with a are ignored. For a detailed definition of the values, see localeconv(3P). The following keywords are recognised. LC_MONETARY: This keyword must be the first in the file. int_curr_symbol: The value is the four-character string to be used as international currency symbol, enclosed in quote marks. currency_symbol: The value is the character used as currency symbol. mon_decimal_point: The value is the decimal delimiter used to format monetary values. mon_thousands_sep: The
172. s. For example: mon_grouping 3;0 6.7.6 positive_sign, negative_sign Keywords The operand is a string used to indicate positive or negative values. For example: positive_sign negative_sign 6.7.7 int_frac_digits Keyword This keyword is an integer that represents the number of fractional digits (those to the right of the decimal delimiter) to be displayed in a formatted monetary quantity using int_curr_symbol. For example: int_frac_digits 2 6.7.8 frac_digits Keyword This keyword is an integer that represents the number of fractional digits (those to the right of the decimal delimiter) to be displayed in a formatted monetary quantity using currency_symbol. For example: frac_digits 2 6.7.9 p_cs_precedes, n_cs_precedes Keywords Each keyword is an integer that is set to 1 if the currency symbol precedes the value for a positive or negative formatted monetary quantity, respectively, and set to 0 if the symbol succeeds the value. For example: p_cs_precedes 1 6.7.10 p_sep_by_space, n_sep_by_space Keywords Each keyword is an integer that is set to 1 if a space separates the currency symbol from the value for a positive or negative formatted monetary quantity, respectively. They are set to 0 if no space separates the symbol from the value. 6.7.11 p_sign_posn, n_sign_posn Keywords Each keyword is an integer that is set to a value indicating the positioning of the positi
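How the cs_precedes and sep_by_space settings combine can be illustrated with a short sketch. The function format_amount and its parameters are hypothetical, and sign handling and fractional digits are deliberately left out; the flags have the meanings given for the p_cs_precedes and p_sep_by_space keywords.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join an already formatted value and a currency
 * symbol. cs_precedes is 1 if the symbol comes before the value and 0 if
 * it comes after; sep_by_space is 1 if a space separates the two.
 * Returns 0 on success, -1 if the arguments are invalid or out is too small. */
int format_amount(const char *value, const char *symbol,
                  int cs_precedes, int sep_by_space,
                  char *out, size_t outsz)
{
    const char *first, *second, *gap;
    size_t needed;

    if (value == NULL || symbol == NULL || out == NULL)
        return -1;

    first = cs_precedes ? symbol : value;
    second = cs_precedes ? value : symbol;
    gap = sep_by_space ? " " : "";

    /* Both strings, the optional space, and the terminating NUL. */
    needed = strlen(first) + strlen(gap) + strlen(second) + 1;
    if (needed > outsz)
        return -1;

    snprintf(out, outsz, "%s%s%s", first, gap, second);
    return 0;
}
```

With cs_precedes set to 1 and sep_by_space set to 0, the value 100 and the symbol $ give $100; with cs_precedes 0 and sep_by_space 1, the value 100 and the symbol kr give 100 kr. Both symbols are illustrative only.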
173. s will be used on the system. The other criteria that should be considered in this decision are as follows: If many files developed on a DOS system need to be processed, or many applications will be used in the VP/ix Environment, an IBM codepage should be used. If the system needs to communicate with a heterogeneous network of computers, an ISO 8859 codeset is the better choice. All the files supporting international keyboards that are supplied with the INTERACTIVE UNIX Operating System, which are located in /usr/lib/keyboard, configure the console to use the IBM codepage 437 (850 for Norway). Additional mapping files are provided as is with the International Supplement, located in subdirectories of /usr/lib/keyboard. They are named after the codeset (437 or 8859-1, for example), and their names follow the X/Open convention for locale names, for example /usr/lib/keyboard/8859-1/fr_FR, which represents the mapfile for French in France using the ISO 8859-1 codeset. 5.6.1 Converting From One Codeset to Another The International Supplement contains a utility, iconv, which can be used to convert the encoding of characters in a file from one codeset to another. The following example shows the command needed to convert the encoding in filename from the IBM codepage 437 to ISO 8859-1: iconv -f 437 -t 8859 filename > file.new Refer to iconv(1P) for more details.
174. s Reference Manual. locale(5P) NAME: locale - define and set international environment. DESCRIPTION: A locale is made up from one or more categories. Each category is identified by its name and controls specific aspects of the behaviour of components of the system. Category names correspond to the following environment variable names. LC_ALL: Overrides the settings of all of the following environment variables. LC_COLLATE: Affects the behaviour of the string collation functions. LC_CTYPE: Affects the behaviour of the character handling functions. LC_MESSAGES: Affects the interpretation of the strings associated with affirmative y and negative responses. LC_MONETARY: Affects the monetary formatting information returned by the localeconv(3P) function. LC_NUMERIC: Affects the decimal delimiter character for the formatted input/output functions and the string conversion functions, as well as the non-monetary formatting information returned by the localeconv function. LC_TIME: Affects the behaviour of the strftime function (see ctime(3P)). LANG: Provides a fallback value to be used if one of the above (except LC_ALL) is not set or is set to the empty string. Programs compiled and linked with the -Xp option can use the setlocale function to modify the environment. When the program starts, the environment is set to the C locale, which corresponds to the tr
175. s must be done before a program using the stored definitions is executed. Note that the program must be set up to check and set the international environment via the setlocale function. In the INTERACTIVE UNIX System, the standard utilities that depend on character classification, such as grep, ls, ed, and sort, have been modified to use the international environment. However, the vi program has not been modified to use the international environment; it uses the information in the /lib/chrclass directory and the value of the environment variable CHRCLASS. Refer to section 9, "THE SYSTEM V ENVIRONMENT," in the International Supplement User's Manual for more information. 4.3 Creating a Character Classification Category Definition Character classification definitions are created using the chrtbl utility. The source language for the character classification category in the INTERACTIVE UNIX Operating System allows the user to define the name of the data file created by chrtbl, the assignment of characters to character classifications, and the relationship between uppercase and lowercase letters. The character classifications recognised by chrtbl are: chrclass: Name of the data file to be created by chrtbl. isupper: Character codes to be classified as uppercase letters. islower: Character codes to be classified as lowercase letters. isdigit: Character codes to be classified
176. s such as vi were basically useless for editing non-English texts. Beginning with UNIX System V Release 3.1, most utilities became what is called 8-bit clean. The INTERACTIVE UNIX Operating System is based on UNIX System V Release 3.2 and therefore contains these 8-bit utilities. As 8-bit characters are now supported, an 8-bit codeset can be used, and the convention is to map 256 unique symbols to 256 unique numbers. As might be expected, more than one such codeset exists in the industry. Fortunately, all have one important feature in common: the first 128 characters of these codesets are exactly the same as the characters in the ASCII codeset. In other words, they are all supersets of the ASCII codeset. 5.3 IBM Codepages The codeset used in IBM-compatible personal computers is probably the single most popular codeset used today, primarily by people who are not even aware that it is designed to support non-English languages. Until recently this codeset was referred to as IBM extended ASCII, which is a very good description of what an 8-bit codeset is: it extends the 128-character ASCII codeset by another 128 characters. The characters used in this codeset, and the way they are encoded, are exactly those characters displayed by the sample program show.c used in section 5.2, "8-bit Characters and Codesets." If you run this program again and look at the output, you will note the following
177. scription of exponent and mantissa precision and number of bits associated with the long double format. Rationale: The long double format can vary in both length and precision. If it is supported other than as a synonym for double, the format needs to be described. Reference: XPG3 Volume 2, Page 328, printf; XPG3 Volume 2, Page 362, scanf. 2.1.6 Data Encryption Question 8: Are the optional data encryption interfaces provided? Answer: crypt: No. encrypt: No. setkey: No. Rationale: Normally an implementation will either provide all three of these routines or will provide none of them at all. If the routines are not provided, then the implementation must provide a dummy interface which always raises an ENOSYS error condition. Reference: XPG3 Volume 2, Page 3, Status of Interfaces. Section 2.2 Process Handling 2.2.1 Process Generation Question 9: Which file types (regular, directory, FIFO, special, etc.) are considered to be executable? Answer: Regular. Options: A list of the types of file that are considered to be executable. Rationale: The EACCES error associated with exec functions occurs in circumstances when the implementation does not support execution of files of the type specified. A list of these file types needs to be provided. Example: Only regular fi
178. scriptor. The operand must consist of seven strings separated by semicolons. The first string must be the abbreviated name of the first day of the week (Sunday), the second string must be the abbreviated name of the second day, and so on. For example: abday "Sun";"Mon";"Tue";"Wed";"Thu";"Fri";"Sat" 3.3.2 day Keyword This keyword is used to define the full weekday names corresponding to the date %A field descriptor. The operand must consist of seven strings separated by semicolons. The first string must be the full name of the first day of the week (Sunday), the second string must be the full name of the second day, and so on. For example: day "Sonntag";"Montag";"Dienstag";"Mittwoch";"Donnerstag";"Freitag";"Samstag" 3.3.3 abmon Keyword This keyword is used to define the abbreviated month names corresponding to the date %b field descriptor. The operand must consist of twelve strings separated by semicolons. The first string must be the abbreviated name of the first month of the year (January), the second string must be the abbreviated name of the second month, and so on. For example: abmon "Jan";"Feb";"Mar";"Apr";"May";"Jun";"Jul";"Aug";"Sep";"Oct";"Nov";"Dec" 3.3.4 mon Keyword This keyword is used to define the full month names corresponding to the date %B field descriptor. The operand must consist of twelve strings separated by semicolons. T
179. ses of characters; for example, [[:lower:]] is a regular expression that means lowercase letter. The INTERACTIVE UNIX Operating System fully supports internationalised regular expressions. Where appropriate, UNIX System utilities have been enhanced to support these capabilities. These utilities are supplied with the International Supplement (see section 10, "INTERNATIONALISED INTERACTIVE UNIX SYSTEM UTILITIES"). For a detailed description of internationalised regular expressions, refer to regexp(5P). 8.3 Collation Collation, according to a dictionary, means putting things in their proper order. Thus collation rules define how the data are put in the proper order, or sorted. Traditionally, the collating order in the UNIX System has been ASCII order, that is, the order in which the characters appear in the ASCII codeset. This is the natural collating order for the English language. For most languages in the world, however, this is not enough. Most European languages contain more letters than the 26 in the English language, with the additional letters typically collating between the letters in the ASCII set. For instance, an accented a sorts between a and b. The average European user expects sorted lists (for instance, the output from the ls command) to appear in the collation order of his or her language. Languages with non-Latin-based alphabets, such as Russian or
180. signed to use the American Standard Code for Information Interchange (ASCII) standard for internal storage. 5.1 ASCII ASCII is a convention, or codeset, describing one-to-one relationships between symbols and numbers. It represents letters as numbers that can be stored in 7 bits of the computer's memory, which means a choice of 128 different symbols (0 to 127). The numbers 0 to 32 are reserved for characters that cannot be displayed on the screen but have a special meaning to the system, so-called nonprintable characters. As an example, 7 represents the sound a computer makes when you press CTRL g. These characters are often referred to as control characters because the CTRL key is needed to generate them. The smiling faces that can be produced on the console, as discussed in the previous section, are not part of the ASCII standard. Only 7 bits of internal storage are needed to store 128 different numbers (0-127), so the ASCII codeset is called a 7-bit codeset (7-bit US ASCII). The 96 printable ASCII characters are encoded as follows:
32 (space)  33 !  34 "  35 #  36 $  37 %  38 &  39 '
40 (  41 )  42 *  43 +  44 ,  45 -  46 .  47 /
48 0  49 1  50 2  51 3  52 4  53 5  54 6  55 7
56 8  57 9  58 :  59 ;  60 <  61 =  62 >  63 ?
64 @  65 A  66 B  67 C  68 D  69 E  70 F  71 G
72 H  73 I  74 J  75 K  76 L  77 M  78 N  79 O
80 P  81 Q  82 R  83 S  84 T  85 U  86 V  87 W
88 X  89 Y  90 Z  91 [  92 \  93 ]  94 ^  95 _
96 `  97 a  98 b  99 c  100 d  101 e  102 f  1
181. specified in ISO 4217, Codes for the representation of currencies and funds. The fourth character must be the character used to separate the international currency symbol from the monetary quantity (normally a space). For example: int_curr_symbol "FMK " 6.7.2 currency_symbol Keyword This keyword defines the string to be used as the local currency symbol. For example: currency_symbol 6.7.3 mon_decimal_point Keyword The operand is the character to be used as the decimal delimiter to format monetary quantities. For example: mon_decimal_point is the Portuguese monetary decimal delimiter. 6.7.4 mon_thousands_sep Keyword This operand is the string to be used as the separator for groups of digits to the left of the decimal delimiter in formatted monetary quantities. For example: mon_thousands_sep 6.7.5 mon_grouping Keyword This keyword is used to define the size of each group of digits in formatted monetary quantities. The operand is a sequence of integers separated by semicolons. Each integer specifies the number of digits in each group, with the initial integer defining the size of the group immediately preceding the decimal delimiter, and the following integers defining the preceding groups. Grouping is performed only for groups with a defined size, unless the last integer is zero, in which case the size of the last group is repeatedly used for the remainder of the digit
182. The ellipsis symbol (...) specifies that a sequence of characters collates according to their encoded character values; that is, all characters with a coded character set value higher than the value of the character in the preceding line and lower than the coded character set value for the character in the following line are placed in the character collation order between the previous and the following character, in ascending order according to their coded character set values. An initial ellipsis is interpreted as if the line preceding it specified the NULL character, and a trailing ellipsis is interpreted as though the line following it specified the highest coded character set value in the current coded character set. An ellipsis is treated as invalid if the lines preceding or following it do not specify characters in the current coded character set. Note that the use of the ellipsis symbol ties the definition to a specific coded character set and may preclude the definition from being portable. The colldef utility issues a warning to this effect if an ellipsis is detected. The explicit specification elsewhere of a character automatically included via an ellipsis symbol is treated as an error. All characters not defined in the order sequence, either explicitly or via an ellipsis, are placed in the collation order via the special symbol UNDEFINED. All such characters are placed into the existing order at the point
183. t file contains a keyword and one or more values. The keyword is separated from the values by space(s) or tab(s). Values are separated by semicolons, which can have spaces or tabs before or after them. Strings must be enclosed in quotation marks; individual characters can be so enclosed, but it is not required. Lines starting with a are ignored. Lines can be continued by using a backslash at the end of the line. The following keywords are recognised:
LC_TIME: This keyword must be the first in the file.
abday: Defines the abbreviated names of the weekdays, starting with Sunday.
day: Defines the names of the weekdays, starting with Sunday.
abmon: Defines the abbreviated names of the months, starting with January.
mon: Defines the names of the months, starting with January.
t_fmt: Defines the format of the time string, using the strftime conversion specifiers (see ctime(3P)).
d_fmt: Defines the format of the date string, using the strftime conversion specifiers (see ctime(3P)).
d_t_fmt: Defines the format of the combined date and time string, using the strftime conversion specifiers (see ctime(3P)).
am_pm: Defines the strings used to represent ante meridiem and post meridiem, in that order.
t_fmt_ampm: Defines the format of the time string in 12-hour format.
END LC_TIME: This keyword must be the last in the file.
184. tation, catopen will fail if: EINVAL: 1. name contains a slash and exists but is not a message catalogue, or 2. name does not contain a slash, a message catalogue was not found using NLSPATH, and the system default /lib/locale/ISC/msgcat/name exists but is not a message catalogue. ENOMEM: Insufficient storage space is available for internal buffer areas. The following are possible failures from the underlying fopen(3) of the message catalogue. EACCES: Search permission is denied on a component of the path prefix, or the file exists and the permissions specified by mode are denied, or the file does not exist and write permission is denied for the parent directory of the file to be created. EINTR: A signal was caught during the fopen function. EMFILE: file descriptors, directories, and message catalogues are currently open in the calling process. ENAMETOOLONG: The length of the filename string exceeds PATH_MAX, or a path name component is longer than NAME_MAX while _POSIX_NO_TRUNC is in effect. ENFILE: The system file table is full. ENOENT: The named file does not exist, or the filename argument points to an empty string. ENOTDIR: A component of the path prefix is not a directory. ENXIO: The named file is a character special or block special file, and the device associated with this special file d
185. ter the same number of pixels is used. For the standard video cards, the different resolutions supported (all or a subset) are: 8 by 8 (8 horizontally and 8 vertically), 8 by 14, 8 by 16. When loadfont is invoked to modify the existing font, it will attempt to do so for the font size currently in use. Use the -m option to switch to another font size. loadfont and ttymap: There is an almost one-to-one relationship between the use of the loadfont utility and the ttymap utility. Whereas loadfont is used to list or modify the images that correspond with the various characters, the ttymap utility is used to determine how characters are generated from the keyboard and which code (a single-byte code) will be used to represent the character internally. The default representation is the IBM extended ASCII codeset, often also referred to as IBM codepage 437. A ttymap sample input file is supplied that can be used for this codeset on a console with a U.S. keyboard (usa.map). When a different keyboard is used, a different ttymap input file is required (e.g., french.map for a French keyboard). When a different codeset is used, both a different ttymap input file and a different font are required. For the most popular codesets, fonts are hard-coded into the loadfont program for the 8 by 16 resolution (see Fonts). If these fonts do not satisfy your needs beca
186. ternational Supplement contains internationalised versions of the most popular UNIX System utilities, such as date, sort, and ls. When using these utilities, users see the date displayed in their own language and can sort text files using the dictionary order of any supported language they specify. The International Supplement also adds to the INTERACTIVE UNIX Operating System the functionality needed to make it fully compliant with X/Open Company Limited's Issue 3 of the X/Open Portability Guide (XPG3), available from Prentice Hall. This guide contains practical standards for application portability as adopted by X/Open Company Limited. This international group of hardware manufacturers and software vendors has defined a Common Applications Environment (CAE) that is built on the interfaces to the UNIX Operating System. Compliance with this CAE is now a requirement when systems are offered to most governments and corporations. The International Supplement Guide includes: International Supplement Overview and Installation Instructions: Provides a general overview of this guide, information about installation requirements, and references and conventions used. International Supplement User's Manual: Provides a comprehensive description of how the INTERACTIVE UNIX System can be used in non-U.S. environments. Among other things, it discusses how to use different keyboards and how to correctly use UNIX System utilities.
187. the language of choice to make the keyboard function properly. The substitution characters can still be generated but not displayed (see section 6, "DISPLAYING DATA"). To accommodate programmers who use such terminals, a new feature called trigraphs has been introduced into the ANSI C language. Trigraphs are three-letter sequences used in an ANSI C source file that are interpreted as a single symbol essential to the C language. This allows a programmer who uses an Italian 7-bit terminal, for example, to still get the job done. The one-to-one relationship between trigraphs and the symbols they represent is listed in the table below.
Trigraph  Symbol Represented
??=  #
??(  [
??/  \
??)  ]
??'  ^
??<  {
??!  |
??>  }
??-  ~
Note that this feature is not available with the traditional Kernighan and Ritchie C compiler. 4.6 Using the VP/ix Environment The Virtual Personal computer Interactive eXecutive environment (VP/ix) is a product developed and sold by INTERACTIVE Systems Corporation. It is a UNIX System application that emulates an IBM PC/XT-compatible computer, which allows users of the INTERACTIVE UNIX Operating System to run DOS and DOS applications as if they were UNIX System utilities. A copy of DOS is furnished with the product and is used by default whenever vpix (the name of the actual command) is invoked. When the VP/ix Environment is used
188. the symbolic names and values used in the ASCII charmap. 2. The charmap can only contain entries describing single-byte characters between the CHARMAP and END CHARMAP statements. The default location for charmap files used by colldef and iconv is /lib/charmap; a charmap file in any other directory must be specified by a path name containing a slash. The format of a charmap file is as follows:
declarations
CHARMAP: This is the charmap header.
regular entries: These are the regular single-byte coded character set descriptions.
END CHARMAP: Defines the end of the charmap.
EXTENDED CHARMAP: Starts optional section defining sequences of one or more bytes to be treated as characters by the command.
extended entries: These are the extended charmap entries.
END EXTENDED CH: Defines the end of the extended charmap section.
The following is a description of the permissible entries in each section and their format. DECLARATIONS: The following optional declarations can precede the character definitions. Each declaration consists of the symbol shown in the following list, starting in column 1, including the surrounding brackets, followed by one or more spaces or tabs, followed by the value to be assigned to the symbol. <code_set_name>: The name of the coded character set for which the character set description file is defined. Only char
189. tial integer defining the size of the group immediately preceding the decimal delimiter, and the following integers defining the preceding groups. Grouping is performed only for groups with a defined size, unless the last integer is zero, in which case the size of the last group is used repeatedly for the remainder of the digits.

As an example of the interpretation of the grouping keyword, assume that the value to be formatted is 123456789 and the thousands_sep is . The following are the results with the various groupings shown:

     grouping     Formatted Value
     3            123456 789
     3;0          123 456 789
     3;2          1234 56 789
     3;2;0        12 34 56 789

6.5.4 An Example of a Numeric Category Definition

     LC_NUMERIC
     decimal_point   M
     thousands_sep
     grouping        3;0
     END LC_NUMERIC

6.5.5 How a Program Uses This Information

If a program needs to access the values in the current locale, it can do so via the library interfaces localeconv and nl_langinfo. Refer to localeconv(3P) and nl_langinfo(3P) for more information.

6.6 Monetary Editing

Monetary editing controls the appearance of monetary numbers. Note that no standard INTERACTIVE UNIX System library routines or commands take into account monetary editing. The following aspects of monetary editing are controlled via the LC_MONETARY locale category:

1. The character used as a monetary decimal delimiter
2. The number of fractional digits
3. The character
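The grouping rule described above (group sizes listed right to left, a trailing zero meaning "repeat the previous size") can be sketched in C as follows. This is only an illustration: group_digits is a hypothetical helper, not a routine from the INTERACTIVE UNIX libraries, and the comma used in the usage note is an assumed separator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a library routine): copy the digit string
 * `digits` into `out`, inserting `sep` between groups. `groups` holds
 * `ngroups` sizes read from right to left, as in "grouping 3;2;0".
 * A 0 entry repeats the previous size; when the sizes run out without
 * a 0, the remaining digits are left ungrouped.
 * Returns the length of the formatted string.
 */
size_t group_digits(const char *digits, const int *groups, size_t ngroups,
                    char sep, char *out, size_t outsize)
{
    char tmp[128];                 /* result is built from right to left */
    size_t pos = sizeof tmp;       /* index of the first used byte in tmp */
    size_t left = strlen(digits);  /* digits not yet copied */
    size_t g = 0;                  /* index of the next group size */
    size_t size = 0;               /* size of the current group */
    size_t len;

    assert(left <= 60);            /* worst case (59 separators) fits tmp */

    while (left > 0) {
        size_t take;

        if (g < ngroups && groups[g] > 0)
            size = (size_t)groups[g++];  /* next defined group size */
        else if (g >= ngroups || size == 0)
            size = left;                 /* sizes used up: stop grouping */
        /* otherwise the entry is 0: reuse the previous size */

        take = size < left ? size : left;
        pos -= take;
        memcpy(tmp + pos, digits + left - take, take);
        left -= take;
        if (left > 0)
            tmp[--pos] = sep;            /* separator before this group */
    }

    len = sizeof tmp - pos;
    assert(len < outsize);
    memcpy(out, tmp + pos, len);
    out[len] = '\0';
    return len;
}
```

Called with the digit string "123456789", a comma as separator, and the sizes 3 and 2, the helper leaves the leading four digits ungrouped; adding a final 0 makes the size 2 repeat up to the first digit.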
190. tively precedes or succeeds the value for a non-negative formatted monetary quantity.

char p_sep_by_space
     Set to 1 or 0 if the currency_symbol respectively is or is not separated by a space from the value for a non-negative formatted monetary quantity.

char n_cs_precedes
     Set to 1 or 0 if the currency_symbol respectively precedes or succeeds the value for a negative formatted monetary quantity.

char n_sep_by_space
     Set to 1 or 0 if the currency_symbol respectively is or is not separated by a space from the value for a negative formatted monetary quantity.

char p_sign_posn
     Set to a value indicating the positioning of the positive_sign for a non-negative formatted monetary quantity.

char n_sign_posn
     Set to a value indicating the positioning of the negative_sign for a negative formatted monetary quantity.

The elements of grouping and mon_grouping are interpreted according to the following:

     CHAR_MAX   No further grouping is to be performed.
     0          The previous element is to be repeatedly used for the remainder of the digits.
     other      The integer value is the number of digits that comprise the current group. The next element is examined to determine the size of the next group of digits before the current group.

The value of p_sign_posn and n_sign_posn is interpreted according to the following:

Parentheses surround the quantity and cur
191. to activate character mapping on input and output for the user's terminal. This same utility can be used for regular terminals as well as for scancode devices, such as the AT console. It makes full use of all the features of the terminal (tty) driver and the keyboard/display driver that support such mapping.

The command ttymap mapfile reads the contents of the file mapfile and sets the corresponding mapping as supported by the terminal driver and/or keyboard/display driver. The layout of the mapfile and the functionality supported by both drivers are described below.

ttymap -d disables the current mapping by the terminal driver.

ttymap -r resets the scancode translation back to that of a U.S. PC keyboard.

Terminal Mapping

The original UNIX operating system was written to support the ASCII codeset. ASCII is one of many standards for representing characters internally as numbers. Characteristic of ASCII is that it supports 128 different characters, each represented by a single byte of which the eighth bit is not used. Many UNIX system applications, including the shell, took advantage of this. Starting with UNIX System V Release 3.1, most of these applications have been modified to properly support characters represented as a byte with the eighth bit set as well. This means that 256 characters can now be supported at the same time. However, a consistent coding convention needs to be applied. In the IBM PC world, an 8-bit coding referre
192. trcoll(3P) strcoll(3P)

NAME
     strcoll - string comparison using collating information

SYNOPSIS
     #include <string.h>

     int strcoll(s1, s2)
     char *s1, *s2;

DESCRIPTION
     The strcoll function compares the string pointed to by s1 to the string pointed to by s2, both interpreted as appropriate to the LC_COLLATE category of the current locale (see locale(5P)).

     The sign of a nonzero value returned by strcoll is determined by the relative ordering, within the current collating sequence, of the first pair of characters that differ in the objects being compared.

RETURN VALUE
     Upon successful completion, the strcoll function returns an integer greater than, equal to, or less than zero, according to whether the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2 when both are interpreted as appropriate to the current locale. On error, strcoll sets errno, but no return value is reserved to indicate an error.

ERRORS
     The strcoll function may fail if:

     [EINVAL]   The s1 or s2 argument contains characters outside the domain of the collating sequence.

NOTE
     The strxfrm(3P) and strcmp (see string(3P)) functions should be used for sorting large lists.

SEE ALSO
     strxfrm(3P).
     string(3P) in the INTERACTIVE SDS Guide and Programmer's Reference Manual.

NOTE TO USERS
     This entry is reprinted from the INTERACTIVE SDS Guide and Programmer's Reference Manual.
193. tten, usually in a different computer language. In the late seventies and early eighties, with the advent of the UNIX Operating System, this situation changed dramatically. This very portable operating system became available on a variety of hardware and supported a common new language: C.

There was still room for improvement, however. Most implementations of the UNIX Operating System were actually different flavors with different features. The C programming language by itself was simply a definition of a language. Supplying libraries with functions like printf, which software developers could immediately use in their programs, was the responsibility of compiler vendors. As far as interfacing with terminals and databases, there was no standard at all. As a result, despite the UNIX Operating System, porting applications (modifying the source program of an application to make it work on a different computer system) required a lot of effort and experienced programmers. Porting became a separate skill.

3.2 Standardisation and the Portability Guide

Many standards committees, as well as AT&T, the developer of the UNIX System, tried to achieve a higher level of standardisation and compatibility. AT&T published the first issue of their System V Interface Definition (SVID), describing all the features of the UNIX Operating System that would be maintained, new ones that would be introduced, and old ones that would disappear in the next release
194. tter will actually appear to be a curly brace Therefore input and output mapping should be supported by the terminal driver to allow the consistent use of one single codeset throughout the system The INTERACTIVE UNIX Operating System supports all mapping features that are now standard in the System V Release 3 2 terminal driver as well as some enhancements by INTERACTIVE Systems Corporation Input mapping On input any byte can be mapped to any byte Using the example above you could map 123 to 130 the code used for in the IBM extended ASCII codeset Output mapping On output any byte can be mapped to either a byte or a string In the above example 130 would be mapped back to 123 to properly display the character on the screen If the connected device is a printer that does not support the char acter it could be mapped to the string BACKSPACE Dead keys On typewriters keys can be found that behave slightly differently than all the others because when you press them the printing wheel of the typewriter does not move CTRL is such a character When it is followed by an e the letter ime is generated This is called a deadkey or a non spacing char acter The terminal driver supports the use of deadkeys Typ ically the character and the umlaut character are used as deadkeys Compose sequences Characters can also be generated using a compose sequence A dedicated character called the compose
195. tuted for see environ SP for the description of NLSPATH from the X Open Portability Guide Volume 2 XSI System Interface and Headers If NLSPATH does not exist in the environment or if a mes sage catalogue cannot be opened in any of the components specified by NLSPATH then the default used by this implementation is lib locale ISC msgcat name In this implementation catopen makes the following interpretations with respect to the processing of NLSPATH 1 If the result from evaluating a a 961 or a t substitution field NLSPATH exceeds NL LANGMAX characters see the file ff usr include limits h it will be truncated to NL LANGMAX characters 2 The result from evaluating a template in NLSPATH must not exceed PATH MAX characters see usr include limits h 3 A 96 in NLSPATH not followed by a defined keyword or another will be ignored The FD CLOEXEC flag will be set for the file descriptor underlying the message catalogue descriptor The oflag argument is reserved for future use and should be set to 0 zero The results of setting this field to any other value are undefined RETURN VALUES Upon successful completion catopen returns a message catalogue descriptor for use on subsequent calls to catgets 3P and catclose 3P Otherwise catopen returns nl_catd 1 and sets errno to indicate the error unless the message catalogue is corrupted in which case errno may not be set ERRORS In this implemen
196. tween ASCII and Greek. The toggle key feature and the ioctl calls that implement this are INTERACTIVE enhancements.

2.3 The ttymap Program

ttymap is an INTERACTIVE utility that permits a user to activate character mapping for the user's terminal on input and output. This utility can be used for regular terminals as well as for scancode devices, such as the AT console. It makes full use of all the features of the terminal (tty) driver and the keyboard/display driver that support such mapping.

The keyboard of the console differs from the keyboards used with regular terminals in two ways: they contain a number of keys, such as the key, that are not found on regular terminals, and they generate scancodes rather than ASCII or extended ASCII codes.

Scancodes generated by PC keyboards typically represent the location of the key on the keyboard; the keyboard driver has to properly translate these scancodes. Without changing the scancode translation, if French users type an A, they see a Q on the screen. Several status keys can influence the translated code as well.

The keyboard driver, and thus the ttymap program, make a distinction between two sets of key combinations that can be translated:

• Regular keys
• Function keys

Up to 60 key combinations are recognised as function keys. The first 12 are the 12 function keys of a 101-key PC keyboard. 13 to are the same keys used in combin
197. ual entries associated with that product or as stated in the documentation Manual entries referred to in this guide may be found in either the International Supplement Reference Manual in this guide the INTERACTIVE SDS Guide and Programmer s Reference Manual that accompanied your INTERACTIVE Software Development System make special note of ctime 3P perror 3P printf 3P scanf 3P environ 5P and regexp5P or the INTERACTIVE UNIX System User s System Administrator s Reference Manual that accompanied your INTER ACTIVE UNIX Operating System w 5 International Supplement User s Manual CONTENTS INTRODUCTION INTERNATIONALISATION THE X OPEN PORTABILITY GUIDE Computer Applications and Portability Standardisation and the Portability Guide Common Applications Environment Standard Portable x Interface POSIX 1 A POSIX 2 The INTERACTIVE UNIX X Operating System ENTERING DATA 4 1 4 2 5 1 5 2 5 3 5 4 5 5 U S Personal Computer Keyboard Layout Generating Characters Not Present on a U S Keyboard 4 2 1 Deadkeys 4 2 2 Composing Characters Using Compose Sequences 4 2 3 Decimal Representation 4 2 4 Smiling Faces European Personal Computer Keyboard Layouts Cyrillic or Greek Keyboards Keyboard Layouts on 7 bit Terminals Using the VP ix Environment Entering Data and od INTERACTIVE X11 STORING DATA IN THE COMPUTER ASCII 8 bit Charact
198. uide and Programmer s Reference Manual DIAGNOSTICS The error messages produced by chrtbl are intended to be self explanatory They indicate errors in the command line or syntactic errors encountered within the input file NOTE TO USERS This entry is reprinted from the INTERACTIVE UNIX System User s System Administrator s Reference Manual INTERACTIVE UNIX System 3 International Supplement colldef 1P colldef 1P NAME colldef generate collation table SYNOPSIS colldef c fcharmap iinputfile s locale DESCRIPTION The colldef utility converts collation source definitions into a format usable by the strcoll 3P and strxfrm 3P functions as well as in sort ing and regular expression processing The colldef command has the following options c A collation table is created if warning messages have been issued Normally both error and warning messages cause the command to terminate without creating the collation table f charmap The path name of a file containing a mapping of charac ter symbols and collating element symbols to actual character encodings This option must be specified if symbolic names other than collating symbols defined in a collating symbol keyword are used If the name does not contain a the program will assume that the char map is located in the directory lib charmap i inputfile The path name of a file containing the source definitions If this option is
199. uotes. Individual characters may be surrounded by quotes, but it is not required. Blank lines or lines containing a number sign (#) in the first column are ignored. The following keywords are recognised:

     LC_NUMERIC       The header.
     decimal_point    Defines the decimal delimiter character.
     thousands_sep    Defines the thousands separator character.
     grouping         Defines the grouping of digits.
     END LC_NUMERIC   The trailer.

6.5.1 decimal_point Keyword

This keyword specifies the character to use as the decimal delimiter in the editing of floating point numbers, both on input and output. The format is:

     decimal_point character

where character is the character chosen as the decimal delimiter.

6.5.2 thousands_sep Keyword

This keyword specifies the character to be used as the thousands separator. The format is:

     thousands_sep character

where character is the character chosen to separate groups of digits to the left of the decimal delimiter in formatted nonmonetary quantities. Note that none of the standard INTERACTIVE UNIX System subroutines or commands recognises a thousands separator.

6.5.3 grouping Keyword

The grouping keyword defines the size of each group of digits in formatted nonmonetary quantities. The format is:

     grouping digit;digit

where the operands are integers separated by semicolons. Each integer specifies the number of digits in a group, with the ini
200. upplement Overview and Installation

• International Supplement Manual for Advanced Users
  This manual is intended for system administrators, programmers, and other advanced users. It describes how to set up a user's international environment to correctly enter data on the keyboard, use UNIX System utilities, and run internationalised applications. It describes the format of collation tables and character classification tables and tells how they should be installed. It also gives a brief overview of the facilities that need to be added to a C source program to give the resulting application internationalised capabilities.

• X/Open Conformance Statement Questionnaire
  Provides the information required to describe the conformance of the INTERACTIVE UNIX Operating System with X/Open Company Limited's Issue 3 of the X/Open Portability Guide.

• International Supplement Reference Manual
  Includes most of the relevant utilities and new library routines referred to in this guide. Although many of these entries are also present in the documentation for the INTERACTIVE UNIX Operating System, users and system administrators can now generally find them in one centralised place. Manual entries for the internationalised versions of UNIX System commands can be found in Volume 1 of the X/Open Portability Guide, Issue 3.

2. INSTALLATION INSTRUCTIONS

The International Supplement
201. use you want to use a different font size or because a customized font is required e g a Greek font loadfont description file to be used with the f option is needed A sample file that describes the IBM extended ASCII font for an 8 by 16 resolution is supplied vga437 bdf A second sam ple file 646g bdf contains a font file for German ASCII See ttymap 1 and loadfont 4 for additional details WARNING When an attempt is made to switch to a mode that the video card does not support e g a switch to EGA on a VGA card that has no EGA mode you will get a blank screen There is nothing wrong with the system simply type in the command to set the mode back e g loadfont m V80x25 FILES usr lib loadfont vga437 bdf sample Bitmap Distribution Format BDF file for IBM 437 font on a VGA usr lib loadfont 646g bdf sample BDF file for German ASCII SEE ALSO ttymap 1 display 7 in the INTERACTIVE UNIX System User s System Administrator s Reference Manual loadfont 4 in the INTERACTIVE SDS Guide and Programmer s Reference Manual NOTE TO USERS This entry is reprinted from the INTERACTIVE UNIX System User s System Administrator s Reference Manual INTERACTIVE UNIX System 3 International Supplement showcat 1P showcat 1P NAME showcat generate a message catalogue source file from a binary mes sage catalogue SYNOPSIS showcat msgfile catfile DESCRIPTION showcat generates a message catalogue source fi
202. uts In Europe computers are sold with either U S keyboards to be used with very technical engineering style applications usually in English or keyboards designed for the local country These key boards differ from U S keyboards in the following ways e Keyboard layout e 102 rather than 101 keys The extra key is usually located between the key and the leftmost bottom row key Z on a U S keyboard In most countries this key has the angle bracket characters lt and gt printed on it In addition the backslash key X on U S keyboards typically the rightmost or second rightmost key in the top row of the central key board section is usually moved to the left of the key in the 12 International Supplement User s Manual third row see Figure 1 The layout usually is the same as the one found on typewriters used in these countries They are often named after the order of the first five keys on the second row of keys key boards used in France are called AZERTY keyboards and keyboards used in Germany are called QWERTZ keyboards International Supplement User s Manual Figure 1 French Personal Computer Keyboard Layout 14 International Supplement User s Manual Most Western European languages have an alphabet that contains only a few more letters than English usually not more than 12 For example French uses all the letters used in English as well as a number of accented characters such as and Some of
203. ve_sign or negative_sign for a positive or negative formatted monetary quantity respectively The following integer values are recognised 0 Parentheses enclose the quantity the currency symbol 1 The sign string precedes the quantity and the currency symbol 2 The sign string succeeds the quantity and the currency symbol 3 The sign string immediately precedes the currency_symbol 4 The sign string immediately succeeds the currency_symbol International Supplement Manual for Advanced Users 41 6 7 12 An Example of a Monetary Category Definition LC MONETARY int_curr_symbol CHF currency symbol SFrs mon decimal point mon thousands sep mon grouping positive sign negative sign int frac digits frac digits P_cs_precedes P_sep_by_space n_cs_precedes see n sep by space P Sign posn n sign posn NR ORB OONN s sU END LC_MONETARY With the above definition a monetary quantity should be edited as follows Positive SFrs 1 234 56 Negative SFrs 1 234 56C 6 7 13 How a Program Uses This Information If a program needs to access the values in the current locale it can do so via the library interfaces localeconv and nl langinfo Refer to localeconv 3P and nl langinfo 3P for more information 42 International Supplement Manual for Advanced Users 7 SPECIFYING YES NO RESPONSE INFORMATION The yes no response category determines the correct string to be used as affirmative
204. very soon. Volume 3 of XPG3, XSI Supplementary Definitions, contains a section specifically about internationalisation, which defines the requirements and pieces together the I18N features in XPG3.

3.6 The INTERACTIVE UNIX Operating System

The INTERACTIVE UNIX Operating System is fully compliant with the POSIX.1 Standard and with XPG3. The International Supplement adds to the INTERACTIVE UNIX Operating System the items needed for full compliance with the X/Open standard, where appropriate for an operating system, its utilities, and its interface to the C language.

The supplement contains a set of UNIX System utilities that have been enhanced to function according to the description of Volume 1 of XPG3. These utilities and their new features are described in section 10 of this document.

The combination of the following software provides customers with a system that is fully compliant with the X/Open standard and that will be branded with the X/Open BASE logo:

• INTERACTIVE UNIX Operating System
• INTERACTIVE Software Development System
• International Supplement

The full seven-volume X/Open Portability Guide is now published by Prentice Hall and is available in specialized bookstores. This set is the only official and complete documentation for the X/Open standard. The documentation supplied with the International Supplement focuses on internationalisation issues only. Internationa
205. vided with the implementation Answer All are provided Options A list of utilities that are not provided Rationale The XPG Volume 1 states that this volume in its current form is useful only as a guide to portability but it is not possible to pre cisely define or test conformance to it This question determines whether or not the implementation provides a command of the name specified in the XPG it does not attempt to determine whether it supports the semantics of that command The optional develop ment utilities are excluded from this question and are dealt with in the next section of the questionnaire Example The mailx and newgrp commands are not provided Reference XPG3 Volume 1 Page 1 Introduction Page 3 1 1 X Open Conformance Statement XCS QUE 3 2 Questionnaire 3 1 2 Command Behaviour Question 2 In what ways do the commands provided by the implemen tation behave differently from the specifications contained in the XPG Answer The commands behave in the manner specified for each of the com mand options detailed in the XPG Options 1 The commands behave in the manner specified for each of the command options detailed in the XPG 2 A list of deviances for each of the commands is provided This list should be in a tabular form giving the name of the com mand the command option and a description of the deviant behaviour Rationale This question provides a greater degree of granularity th
206. y unless an option to prevent this translation was used when gencat was used to create the message catalogue. Refer to showcat(1P) for more information.

The following example lists the source of the famous hello.c program when fully internationalised:

     #define _XOPEN_SOURCE
     #include <stdio.h>
     #include <locale.h>
     #include <nl_types.h>

     main(argc, argv)
     int argc;
     char *argv[];
     {
          nl_catd catd;

          /* select the locale named by the environment */
          setlocale(LC_ALL, "");
          /* open the catalogue named after the program */
          catd = catopen(argv[0], 0);
          /* message 1 of the default set, with a fallback string */
          printf("%s\n", catgets(catd, NL_SETD, 1, "hello world"));
          catclose(catd);
     }

The message catalogue source looks like this:

     $set 1
     1 hello world

8.8.1 Extension of printf Syntax

The example shown handles a simple case of a message catalogue: a string without parameters to be filled in. However, many messages do have parameters. When text is translated, the words in the translated version often have to be in a different order than in the original because of grammatical differences. For example, in English adjectives precede nouns ("white lady", a cocktail), whereas in French they usually follow nouns ("dame blanche", a famous ice cream dish). When program messages are translated and the program uses printf, X/Open extensions provided in the INTERACTIVE UNIX System can be used to indicate the order. Normally, conversions in a format string are performed in the order they are specified in the format statement th
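The word-order problem described above is usually handled with numbered conversions such as %1$s and %2$s, which select an argument by position instead of by order of appearance. The following is a minimal sketch, assuming a C library that supports these X/Open numbered conversions and provides snprintf; format_phrase is a hypothetical helper, and in a real program the format string would come from catgets rather than from the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a two-word phrase from an adjective and a
 * noun. The arguments are always passed in the same order; the format
 * string (normally fetched from a message catalogue) decides the order
 * in which they appear, for example "%1$s %2$s" or "%2$s %1$s".
 * Returns the value returned by snprintf().
 */
int format_phrase(char *out, size_t outsize, const char *fmt,
                  const char *adjective, const char *noun)
{
    assert(out != NULL && fmt != NULL);
    return snprintf(out, outsize, fmt, adjective, noun);
}
```

With the format "%1$s %2$s" and the arguments "white" and "lady", the English order is kept; a French catalogue entry can use "%2$s %1$s" so that the noun comes first, without any change to the program.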
207. y on the keyboard and the characters are generated by the terminal hardware International Supplement User s Manual 11 4 2 3 Decimal Representation A third method of generating characters is using their decimal representation As explained in section 5 STORING DATA IN THE COMPUTER every character corresponds to a unique number Up to 256 different characters can be used although some terminals only support 128 When the key is used fol lowed by three digits the character that is internally represented by the three digit number in decimal is generated This feature is also derived from the DOS system Press the key sequence followed by 065 and an A appears on the screen 65 is the decimal value used by computers to store the uppercase letter A Press the key sequence followed by 136 and the letter appears If you type ttymap d all deadkeys and compose sequences are disabled 4 2 4 Smiling Faces Those familiar with personal computers and certain DOS applica tions may have seen interesting images the size of a character such as smiling faces or musical notes When control characters are used characters generated by pressing and a letter key simultane ously normally nothing is displayed on the screen However when the key is pressed before pressing CTRL an image appears on the screen note that this only works on the console For exam ple produces a smiling face 4 3 European Personal Computer Keyboard Layo
208. yes/no response category in the INTERACTIVE UNIX Operating System is the language defined by the POSIX.2 group for the LC_MESSAGES category.

A yes/no response source definition consists of a header, a response body, and a trailer. The header is the word LC_MESSAGES. The trailer is the string END LC_MESSAGES.

The response body consists of one or more lines of text. Each line contains a keyword followed by one or more operands. Keywords are separated from the operands by one or more blank characters (space or tab). Operands are characters, strings of characters, or digits. When a keyword is followed by more than one operand, the operands must be separated by semicolons. Blank characters are allowed before and/or after a semicolon. Strings must be surrounded by quotes. Individual characters may be surrounded by quotes, but it is not required. Blank lines or lines containing a number sign (#) in the first column are ignored. The following keywords are recognised:

     LC_MESSAGES       The header.
     yesexpr           Defines the affirmative (yes) response.
     noexpr            Defines the negative (no) response.
     END LC_MESSAGES   The trailer.

7.4.1 yesexpr Keyword

This keyword specifies the character or string to use as the affirmative (yes) response. The format is:

     yesexpr regular expression

where regular expression is a regular expression which, when used to match affirmative responses, will report a match
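A program can apply a yesexpr or noexpr value to a user's answer with the POSIX regcomp and regexec functions. The following is a minimal sketch under stated assumptions: matches_response is a hypothetical helper, and the expression is passed in by the caller; how the program obtains the locale's expression is not shown here.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: test a user's response against a yes/no
 * regular expression.
 * Returns 1 on a match, 0 on no match, and -1 if the expression
 * cannot be compiled.
 */
int matches_response(const char *expr, const char *response)
{
    regex_t re;
    int rc;

    assert(expr != NULL && response != NULL);

    if (regcomp(&re, expr, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;                      /* invalid expression */

    rc = regexec(&re, response, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

An expression such as "^[yY]" (an illustrative value, not one taken from this manual) accepts any answer that starts with y or Y.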
209. yntax is

     [:class name:]

where class name is the name of one of the following:

     alpha    a letter
     upper    an uppercase letter
     lower    a lowercase letter
     digit    a decimal digit
     xdigit   a hexadecimal digit
     alnum    an alphanumeric (letter or digit)
     space    a character that produces white space in displayed text
     punct    a punctuation character
     print    a printing character
     graph    a character with a visible representation
     cntrl    a control character

For example, the following command will find all file names in the current directory that begin with an uppercase letter:

     ls [[:upper:]]*

These specifications are primarily intended to replace the current use of expressions like [A-Z], which are not portable (Z is not the last letter in all alphabets).

5. PREPARING AND INSTALLING A COLLATION SEQUENCE

A collation sequence specifies how characters and collating elements should be sorted, that is, the order between characters and collating elements. Collation sequences are created using the colldef processor (refer to colldef(1P) for more information). This section describes how to set up a source collation sequence definition and use it to create a collation sequence. Once the source definition is created and tested, you can use it to create object collation sequences, which are stored in a file named COLLATE
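Inside a C program, the same class test as the upper class in the shell pattern above is available through the standard isupper function, which consults the character classification (LC_CTYPE) category of the current locale. A minimal sketch follows; begins_with_upper is a hypothetical helper written for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if `name` begins with a character that
 * the current locale classifies as an uppercase letter, otherwise 0.
 * The cast to unsigned char keeps 8-bit characters from being passed
 * to isupper() as negative values.
 */
int begins_with_upper(const char *name)
{
    assert(name != NULL);
    return isupper((unsigned char)name[0]) != 0;
}
```

Unlike a hard-coded range comparison, this test follows whatever classification table is active, so a program that calls setlocale(LC_CTYPE, "") first also accepts accented uppercase letters defined for the user's locale.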
