NonStop SQL, A Distributed, High-Performance, High-Availability Implementation of SQL
Contents
1. [Figure 2: The division of labor in Enscribe between the application process, the file system, and the disk processes. The diagram shows the application process; the file system (files, records, partitions, indices, authorization, transactions, buffering); the file structures (B-tree and relative organizations, records or blocks, locking, logging); and media management (space allocation, duplexing, main-memory cache). The file system runs as a protected subsystem of the application process. Disk processes run in a processor connected to the disk they manage. In general, the file system communicates with many disk processes and the corresponding disks. This figure can be compared to Figure 6, which shows the division of labor in the NonStop SQL system.]

Transaction Management

Files may be designated as audited (transaction protected), either when they are created or at a later time. Updates to audited files are automatically protected by locks belonging to the transaction and by undo and redo log records. An application program can issue BeginTransaction, which assigns a unique transaction identifier. This identifier is automatically included in all messages sent by the application and by its servers. All updates to audited files generate log records and locks tagged with this transaction identifier. Eventually the application calls EndTransaction, which commits the transaction if possible, or it calls
2. the SQL executor will recompile the statement and pick a plan not using the index. There may be no reasonable plan in this case: a sequential scan of a terabyte is not reasonable. Users can disable automatic recompilation of a program; in this case the program will get an error if the plan is invalid.

Compilation on Demand

Some programs are generic: they run against a different database on each invocation. Defines are used to remap the table names on each invocation. Tandem's performance monitor, which analyzes a performance measurement, gives a classic example of this: each measurement is a different set of tables. When such a program is invoked, the default is to recompile all its SQL statements. Since the performance monitor has about one thousand SQL statements, it has a lengthy and expensive startup. If most invocations of the program involve only a small fraction of the SQL statements, much of this SQL compilation effort is wasted. So there is an option to compile invalid SQL statements on demand (when used), rather than recompiling all of them at process initiation.

Run-Time Statistics and Error Reporting

The NonStop SQL programmatic interface has two control blocks which feed information back to the application program. Detailed diagnostics are provided in the SQL Communications Area (SQLCA) in case of error. Any SQL error is reported by an error code along with up to seven diagnostic messages. A routine is provided to format these messages
3. [Figure 6a: The division of labor within a compiled and executing Cobol-plus-SQL program. The diagram shows the file system (files, partitions, plan invalidation checks, indices, transactions, buffering) and the disc process (SQL operations: select and update of subsets, records or blocks, check constraints, protection views), above the file structures (B-tree, relative, locking, logging) and media management (space allocation, duplexing, main-memory cache). The application calls the SQL executor, which handles the SQL plans and subcontracts single-variable queries to the file system and disc process. The file system manages the physical schema. The disc process manages subsets and integrity constraints on table fragments (partitions). See also Figure 6b.]

[Figure 6b: The structure of a compiled and executing program, showing the application process (code, user working storage, SQL plans, executor, data buffers) and the disk servers (single-variable query, project, select; file opens; message buffers; locks; transaction log; B-trees). The application calls the SQL executor, which in turn calls the file system. The file system sends single-variable query requests to disk processes. The disk process does projections and selections on tables and protection views to produce a record subset. This subset is returned to the file system and executor, or is updated or
4. TANDEM SYSTEM
     Hardware Architecture
     Operating System and Network
     Data Management
     Transaction Management
     Why SQL
   NonStop SQL LANGUAGE FEATURES
     Naming
     Logical Names for Location Independence
     Dictionary and Catalogs
     Unifying Logical and Physical DDL
     Logical Table Attributes
     Physical Table Attributes
     Views
     Data Manipulation
     Transaction Management and Concurrency Control
     Local Autonomy
     Conversational Interface
     Programmatic SQL
     Integrating SQL With Host Language Compilers
     Host Language Features
     Integrating SQL Programs With Object Programs
     Static and Dynamic Compilation
     Run-Time Statistics and Error Reporting
   IMPLEMENTATION
     The SQL Compiler
     Subcontracting Single-Variable Queries to Disk Processes
     Sequential Block Buffering or Portals
     Compilation and Local Autonomy
     Invalidating Compiled Statements
     Table Opens vs. Cursor Opens
     NonStop Operation
   PERFORMANCE
     Single-Variable Query Processing Performance
     Performance on the DebitCredit Benchmark
     Performa
5. When the database design changes or the table statistics change, compiled plans become invalid. NonStop SQL's approach to invalidation differs from other systems in two ways. First, NonStop SQL invalidates plans whenever there is a chance that the change may affect the plan. For example, most SQL systems do not automatically recompile when an index is added. This could confuse a user who added an index and noticed that it did not help performance at all. NonStop SQL will automatically recompile the program in this case, and in any other case where the plan may change.

NonStop SQL also differs in the way it invalidates programs. Most SQL systems invalidate SQL compilations by clearing the valid bit of the plan in the catalog. In such systems, whenever a program executes a transaction, it first reads the plan's valid bit from the catalog to assure that it is still valid, and then proceeds. If the bit is off, the program automatically re-SQL-compiles itself. NonStop SQL rejected this approach because the valid bit can become a bottleneck if every transaction must access it, and because local autonomy implies that the database may change while some program is inaccessible. So NonStop SQL maintains the program valid bit only as a hint of what programs need recompilation.

NonStop SQL adopted a timestamp scheme to invalidate compiled statements. Associated with every table and view is a timestamp called the redefinition timestamp. This timestamp i
6. between defines and synonyms can be summarized by:
- Both provide an aliasing mechanism for data definition and data manipulation.
- Defines are supported as a standard part of the Tandem system naming mechanism, and so provide a general tool that works for all files, tables, devices, and processes. Synonyms are unique to the SQL environment.
- Defines are a process attribute specified explicitly, while synonyms are a user attribute and are stored in the catalogs.
- Defines require special syntax (an "=" preceding the name), while synonyms fit within the SQL naming scheme.

Defines were supported because they are part of the Tandem naming system. Synonyms will probably also be supported someday.

Dictionary and Catalogs

The descriptive information representing the logical schema is kept in an SQL database called the dictionary. Information about a table is replicated at every site having a fragment of the table, so that the local parts of the table can be accessed even if the site is disconnected from the network. This design rule, called local autonomy, implies that the dictionary be partitioned into catalogs, each catalog acting as a local dictionary for tables at that site. The transaction mechanism is used to protect updates to catalogs, and so maintain the consistency among catalogs at different sites. A simplified diagram of the catalogs is shown in Figure 3. A site can have a single catalog, a catalog per project or applicat
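To make the contrast concrete, here is a hedged sketch of using a define from SQL. The ADD DEFINE command shown in the comment is issued from the Tandem command interpreter, not from SQL, and the names and option spellings are illustrative assumptions, not quotations from the manuals:

```sql
-- A define (note the "=" prefix) remaps a logical name to a physical table.
-- Established outside SQL, e.g. from the command interpreter (illustrative):
--   ADD DEFINE =EMP, FILE $DATA1.SALES.EMP
-- The define can then be used wherever a table name is expected:
SELECT name, deptnum
FROM =EMP
WHERE deptnum = 100;
```

Because the define is a process attribute, the same program can be run against a different table on each invocation simply by changing the define before startup.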
7. deleted by the disk process.

The file system manages the physical schema. It handles opens of files and indices, and partitioning; it sends requests to appropriate disk processes and buffers the replies. When a table is updated, the file system manages the updates to the base file and to all the secondary indices on the file. When a record is retrieved via an index, the file system first looks in the index and then in the base file. This is because the index may well be on a disk separate from the base table, so the disk process cannot do index maintenance in general. If a retrieval can be entirely satisfied by the index, the base table is not accessed; simple selects, Palermo's semi-joins [Palermo], and minimum values are typical uses of this feature.

Each disk volume is managed by a set of disk processes. These processes have a common request queue and a shared buffer pool, which they coordinate via semaphores. The disk processes implement file fragments. They manage disk space, access paths, locks, log records, and a main-memory buffer pool of recently used blocks. A disk process operates on the single-variable query, scanning the database (usually via the primary index) to find records which satisfy the selection expression. Once the records are found, the disk process either operates on the records (update expression or delete) or the projected records are returned to the file system. If the request is a very long one (more than 10 I/Os), t
8. in a specified national language and display them to a diagnostic stream or to a buffer.

In addition, the SQL executor returns detailed statistics in the SQL Statistics Area (SQLSA). These include a table-by-table count of records used, records examined, disk reads, messages, message bytes, waits, timeouts, escalations, and busy times. Designers can either instrument their programs by reading these counters, or they can use the operating system's measurement facility, Measure, to read these SQL counters along with other system counters.

IMPLEMENTATION

The NonStop SQL system and its utilities are about 7 million lines of code implemented by up to 25 developers over 3 years. This was the sixth SQL implementation for one developer; five others had done it at least once before. Many of the developers had implemented other relational systems. We tried not to repeat our earlier mistakes. Implementing SQL itself took only a few people; most of the effort went into integration with the rest of the system and into utilities. The quality assurance, performance assurance, documentation, marketing, education, and beta test efforts involved many more people in the product.

The system is integrated with the operating system. When the system is up, NonStop SQL is up. One does not bring up an SQL database; it is just there. This contrasts with most other designs. In addition, because the operating system and SQL authorization are integrated, ther
9. it is audited by default. Updates to audited tables are automatically protected by locks belonging to the transaction and by undo and redo log records. The user may declare a table to be non-audited by altering the audit attribute. Any program locking or updating an audited table must be part of a transaction: it must do a BEGIN WORK, or be servicing a request from some process that did a BEGIN WORK. As explained earlier, the operating system message kernel manages the propagation of transaction identifiers as part of the requester-server (remote procedure call) flow within the transaction.

The locking features include: file, generic, and record granularities; automatic escalation to coarser granularity; implicit or explicit shared and exclusive lock modes; three degrees of consistency, selectable on a per-statement basis; and a LOCK TABLE verb. Deadlock detection is via timeout; the default timeout is sixty seconds. The CONTROL TABLE verb allows the user to specify the handling of lock waits. The user may request bounce locks, which never wait for a busy resource; rather, they return an exception. In addition, NonStop SQL supports read-through locks (BROWSE ACCESS), which allow reading dirty data (uncommitted updates). The CONTROL TABLE command allows a program to alter these attributes on a table-by-table or statement-by-statement basis.

Non-audited objects complicated the design, since non-transactional locking is explicit an
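The locking controls above might be exercised as in the following sketch. The exact NonStop SQL spellings of the LOCK TABLE and access clauses are assumptions for illustration rather than quoted syntax; BEGIN WORK and COMMIT WORK bracket the transaction as described in the text:

```sql
BEGIN WORK;

-- Explicitly take a table-granularity lock (clause spelling illustrative):
LOCK TABLE accounts IN EXCLUSIVE MODE;

-- Read through locks (dirty reads), one of the selectable
-- degrees of consistency (clause spelling illustrative):
SELECT balance
FROM accounts
WHERE account_number = 42
BROWSE ACCESS;

COMMIT WORK;
```

A CONTROL TABLE statement could similarly be used to request bounce locks or a shorter lock-wait timeout for one table without changing the program's other statements.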
10. papers [Astrahan], the SQL/DS manuals [SQL/DS], the DB2 manuals [DB2], and the ANSI SQL definition [ANSI]. Extensions and variations were added to support distributed data, high-performance operational interfaces, and integration with the Tandem system. When development began, the ANSI SQL standard [ANSI] did not exist. Fortunately, only minor changes were required to achieve close compliance with the ANSI standard.

Naming

The first chore was to decide how naming, directories, and security should work. Standard SQL naming has the form user.table. This is inadequate for a distributed system with local autonomy. In such a system, user names are qualified by site name, and table names need to be more complex so that names may be created without consulting any other site [Lindsay]. NonStop SQL names objects like any other objects in the Tandem system. That naming convention is site.process.directory.object. These names are used to name tables, indices, views, and programs. Naming of columns follows the ANSI SQL conventions. Integrity constraints are named by ANSI SQL identifiers, so that diagnostic messages can explain which constraint is violated.

Considerable care was taken to make catalogs and naming automatic. The Tandem default naming works for SQL objects. Programs, tables, views, and all other system objects are named in the same way. The distinction between logical and physical schema is almost invisible (automatic) because table names an
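The site.process.directory.object convention can be illustrated with a hypothetical fully qualified name; the node, volume, and subvolume names below are invented for the example:

```sql
-- In Tandem spelling: \NY is the site (node), $DATA1 the disc
-- volume (process), SALES the subvolume (directory), and EMP
-- the table (object).
SELECT *
FROM \NY.$DATA1.SALES.EMP;
```

Because every system object follows the same scheme, the same name works whether the table is local or at a remote site, which is what makes location independence via defines possible.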
11. table from site3 will be registered in a catalog at site3. The program's usage of the table will be recorded in the USAGES table at site1 and site3.

The catalog tables shown in Figure 3 have the following semantics:
- TABLES has the name, owner, and security of each table or view.
- VIEWS has the view definition, stored as a character string.
- COLUMNS has a description of each column of each table or view.
- BASETABS has information that applies to tables but not to views.
- CONSTRAINTS has the text of integrity constraints attached to tables.
- FILES describes the physical files that support tables and indices.
- PARTITIONS gives the name and catalog of partitions of any table in this catalog.
- INDEXES describes an index file, its key, and whether it is unique.
- KEYS describes the permutation of the index columns to the base table columns.
- PROGRAMS describes programs which contain SQL. It tells when each program was last compiled and whether the compilation has since been invalidated by changes to the database schema.
- COMMENTS stores user-provided descriptive text about objects.
- USAGES is a concordance which shows who uses what. It is used to generate where-used reports, to direct program invalidation when a relevant table or view definition changes, and to relate indices and views to their base tables.

An additional table, not shown in the figure, gives the current catalog's version number.
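Because the catalogs are themselves SQL tables, they can be queried like any other data. A hedged sketch follows; the column names are modeled on Figure 3 but should be treated as illustrative:

```sql
-- Find programs whose compiled plans have been invalidated by
-- schema changes, using the PROGRAMS catalog table:
SELECT programname, recompiletime
FROM programs
WHERE valid = 'N';
```

This is the kind of query an operator might run to decide which programs to recompile off-line, since the valid bit is maintained only as a hint.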
12. table may be declared to be a protection view. A protection view inherits the organization, indices, and partitioning of its underlying table. In addition, protection views are updateable. To simplify authorization, only the table owner can define and own a protection view; the accessor is authorized to the view.

Protection views are implemented by the disc servers, and consequently really do provide protection: one can grant access to the view without granting access to the underlying tables. This differs from other implementations, which place a security mechanism on top of the operating system protection mechanism, or even those, like R*, which implemented security in the client node's data manager. Node autonomy requires that security be implemented at the node which owns the data. There are no back doors to this protection mechanism: the disc server knows about tables and views, and there is no lower-level access to the underlying files. This is one of the benefits of integrating SQL with the operating system.

By contrast, a shorthand view is simply a macro definition. When the view name is invoked, the view definition is substituted, and the authorization and query optimization are as though the macro body had been entered directly. So the user of a shorthand view must be authorized to its underlying protection views and tables. Shorthand views are very general: they allow combinations of tables and views via projection, selection, joins, and agg
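The two kinds of view might be declared as in the sketch below. The FOR PROTECTION clause is an assumption about the spelling, and the table and column names are invented:

```sql
-- A protection view: a simple projection/selection of one table,
-- enforced by the disc server that owns the data (clause illustrative):
CREATE VIEW emp_public AS
  SELECT name, deptnum
  FROM emp
  WHERE deptnum <> 999
  FOR PROTECTION;

-- A shorthand view: a macro that may join tables; its user must be
-- authorized to the underlying tables and protection views:
CREATE VIEW emp_dept AS
  SELECT e.name, d.deptname
  FROM emp e, dept d
  WHERE e.deptnum = d.deptnum;
```

Granting access to emp_public, but not to emp, gives a reader the projected, selected subset and nothing more, with no lower-level path around the check.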
13. the first SQL system to offer partitioning of tables across multiple sites of a network with full transaction recovery.

For performance, various file organizations are supported. The standard file organization is a B-tree based on the table's primary key. Sophisticated users can specify a relative file (directly addressable records) or an entry-sequence file (insert only at the end; records cannot grow once inserted). If the user does not specify an organization or key when the table is created, then the default is a key-sequenced table with the GMT Julian timestamp of each insert used as the system-generated record key. System-generated keys for relative, entry-sequenced, and default files appear in the table as a column named SYSKEY.

By default, tables are covered by transaction protection (logging and locking), but this is a user option. Each table has a default lock granularity, which can be set or changed. The default is record granularity, but the designer can request generic locking on a key range. As mentioned later, these defaults can be overridden by a program or statement.

Views

As the R* designers pointed out, implementing views in a distributed system with local autonomy is not straightforward. The ideas of shorthand view and protection view were borrowed from the R* project [Williams]. On closer inspection, only the names were borrowed. In NonStop SQL, any view which is a simple projection and selection of a single
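The file organizations above are chosen at table-creation time. The following is a hedged sketch; the ORGANIZATION clause spelling is illustrative of the idea, not quoted syntax:

```sql
-- An entry-sequenced table: inserts go at the end, records cannot
-- grow, and the system-generated key appears as the SYSKEY column
-- (clause spelling illustrative):
CREATE TABLE eventlog
  ( message CHAR(80) NOT NULL )
  ORGANIZATION ENTRY SEQUENCED;

-- SYSKEY can then be read back like any other column:
SELECT syskey, message FROM eventlog;
```

Omitting the organization and key entirely would instead yield the default: a key-sequenced table whose SYSKEY is the GMT Julian timestamp of each insert.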
14. tuning high-performance applications.

Degree 3 consistency was sacrificed in one case to avoid hotspots. When inserting at the end of an entry-sequenced or relative file, NonStop SQL locks the inserted record but does not lock the end of file. This gives only degree 2 consistency [Gray], since insert order may differ from commit order unless a table lock is acquired. Locking the end of the table (file) for the duration of a transaction is equivalent to a table lock for sequential files, a well-known bottleneck in high-performance systems. IMS FastPath solves this problem by deferring sequential inserts until transaction commit [Gawlick]. The FastPath design has more complex semantics, and also sacrifices degree 3 consistency, since the records do not appear in the file when they are inserted.

Transaction Management and Concurrency Control

Tandem's Encompass data management system provides a transaction mechanism that includes transaction rollback and distributed transactions. NonStop SQL is integrated with this transaction manager [Borr]. A single transaction log (audit trail) is maintained at each site. This log provides undo, redo, and media recovery for old Enscribe and new SQL data. One transaction can contain both Enscribe and SQL calls, and is recorded in a single log per site. Having one transaction log per site eases system management when compared with the non-integrated log-per-subsystem of other designs.

When a table is created,
15. will support them in release 2. Defaults are implemented as specified in ANSI SQL: each type has a default value, which the user can override.

Users may add integrity constraints to a table. Constraints are named single-variable queries. The constraint is validated against the table when the constraint is first defined. Thereafter, any insert or update which violates the constraint will be rejected. Constraints are enforced by the disk process (file server). This removes many integrity checks from the application program. Since constraints are named and may be added or dropped at any time, NonStop SQL's implementation is slightly more general than the ANSI SQL definition of <check constraint> [ANSI].

The ANSI SQL numeric types of INTEGER, NUMERIC, and DECIMAL (up to 64 bits) are supported. Fixed- and variable-length character strings are supported. Records cannot span blocks, and so are limited to 4K bytes in the initial release. Indices involving multiple fields, ascending and descending, may be defined on tables. Comments on all objects are supported. Referential integrity will be implemented in a later release.

Physical Table Attributes

Tables and indices can be horizontally partitioned in the network. Each partition resides on a particular disk volume. The partitioning criteria are user-specified low key values. New partitions can be added to the end of a table. Although partitioning tables in a network is much discussed, NonStop SQL is
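A constraint declaration might look like the sketch below. Since constraints are named single-variable queries, each mentions only one table; the CREATE CONSTRAINT spelling and the table name are assumptions for illustration:

```sql
-- A named constraint, checked by the disk process on every
-- insert or update of ACCOUNTS (syntax illustrative):
CREATE CONSTRAINT positive_balance
  ON accounts
  CHECK (balance >= 0);

-- Because constraints are named, they can be dropped at any time:
DROP CONSTRAINT positive_balance;
```

Pushing the check into the disk process means the rule holds no matter which application touches the table, and the constraint name lets diagnostics say exactly which rule a rejected update violated.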
16. AbortTransaction, which undoes the audited updates of the transaction [Borr]. The underlying mechanism provides transaction backout and distributed transactions, with a commit protocol implemented with non-blocking, grouped two-phase, and presumed-abort protocols [Mohan, Helland]. A single transaction log (audit trail) is maintained at each site. This audit trail provides undo, redo, and media recovery for both old Enscribe and new SQL data.

The Tandem system tolerates any single fault without interrupting service. If there is a double fault or an extended power outage, the transaction recovery system recovers data by aborting all uncommitted transactions and redoing recently committed transactions. In case of double media failure, the transaction manager supports media recovery from an archive copy of the data and the transaction log, by redoing recently committed transactions. Archive dumps of the database can be captured while the database is being updated; that is, media recovery can work from a fuzzy dump.

Why SQL?

Perhaps the most controversial decision of the NonStop SQL project was to abandon compatibility with Tandem's existing database products and adopt an SQL interface instead. After all, Encompass was the first commercial distributed database system. It has many strong features and a loyal following. In retrospect, the choice of SQL seems less courageous, since SQL has become the standard data management interface. At the tim
17. Computers Inc., Cupertino, CA, Dec. 1986.
[NonStop SQL] Introduction to NonStop SQL, Part No. 82317, Tandem Computers Inc., Cupertino, CA, March 1987.
[NonStop SQL 1] NonStop SQL Benchmark Workbook, Part No. 84160, Tandem Computers Inc., Cupertino, CA, March 1987.
[NonStop SQL 2] Tandem Performance Group, "A Benchmark of NonStop SQL on the DebitCredit Transaction," SIGMOD 88, ACM, June 1988.
[Pathway] Introduction to Pathway, Part No. 82339, Tandem Computers Inc., Cupertino, CA, June 1985.
[Pathmaker] Introduction to Pathmaker, Part No. 84070, Tandem Computers Inc., Cupertino, CA, March 1987.
[Palermo] F. Palermo, "A Database Search Problem," Information Systems COINS IV, J. Tou, ed., Plenum, 1974.
[Rowe] L. Rowe, "Database Portals: A New Application Programming Interface," VLDB, Aug. 1984.
[SafeGuard] SafeGuard User's Manual, Part No. 82539, Tandem Computers Inc., Cupertino, CA, Feb. 1987.
[SQL/DS] SQL/Data System Concepts and Facilities, IBM Form No. GH24-5013, Armonk, NY, Feb. 1982.
[Teradata] "The Genesis of a Database Computer: A Conversation with Jack Shemer and Phil Neches of Teradata Corporation," IEEE Computer, Nov. 1984.
[Williams] R. Williams et al., "R*: An Overview of the Architecture," IBM Research Report RJ3325, San Jose, CA, Dec. 1981.
18. [Figure 5B: The data flow through the Pascal compiler, binder, and SQL compiler, producing a Pascal object with SQL source and SQL object sections. The Pascal, C, and Tal compilers directly support the NonStop SQL language without need for a preprocessor. The consequent data flow is shown here, in contrast to Figure 5A.]

Host Language Features

NonStop SQL's programmatic interface has many features to ease programming, including: comprehensive diagnostics embedded in output listings; the ability to invoke data declarations of tables from the catalogs; support for WHENEVER exception handling; support for multiple levels of copy libraries; and generation of trace information, so that application programmers can track errors back to source-language statements. In addition, it supports separate compilation: a cursor may be defined in one compilation and used in another. Program B can use cursor C of program A by referring to cursor A.C.

Integrating SQL Programs With Object Programs

Tandem's implementation of program compilation is similar to the original System R implementation [Astrahan]. An innovation lies in the binding of the SQL source and object programs with the Cobol85 (or other host language) object modules. The resulting object program is a single object which can be moved, copied, secured, archived, or purged without having to manipulate one or more separate access modules. By contrast, most other SQL systems store the SQL progra
19. If, on the other hand, it is presented with host variables, it assumes that a fraction of the domain will be selected. The fraction is inversely proportional to table size. This wild guess may lead to a poor plan. For example, consider the query:

SELECT customer FROM accounts WHERE account_number BETWEEN :min AND :max

Suppose that there are a million accounts. The optimizer will guess that this query will select 333,333 records. If, on the other hand, it sees the query:

SELECT customer FROM accounts WHERE account_number BETWEEN 50000 AND 60000

then the optimizer will guess that 10,000 records will be selected. This disparity may result in bad query execution plans. To our knowledge, all relational systems suffer from this illness. The problem is serious for SQL programs: almost all SQL statements will involve host-language variables rather than literals. It is troublesome that SQL optimizers are unable to guess set cardinalities in this case.

There are three proposals (not implemented) to deal with this problem. The first is to let the user give the compiler hints. The second idea is to execute the program in training mode with sample parameters, compile the statements on the fly, and save the plans used for the presented literals as the plans to be used in production. A third scheme is to set a threshold for guesses: if the compiler cannot find a cheap plan when host-language variables are present, it would simply not
20. NonStop SQL: A Distributed, High-Performance, High-Availability Implementation of SQL

The Tandem Database Group
April 1987, revised July 1988

Abstract

NonStop SQL is an implementation of ANSI SQL on Tandem Computer Systems. It provides distributed data and distributed execution. It can run on small computers, and has been benchmarked at over 200 transactions per second on a large system. Hence, it is usable both in the information center and in production environments. NonStop SQL provides high availability through a combination of NonStop device support and transaction mechanisms. The combination of SQL semantics and a message-based distributed operating system gives a surprising result: the message savings of a relational interface pay for the extra semantics of the SQL language when compared to record-at-a-time interfaces. This paper presents the system's design rationale and contrasts it to previous research prototypes and to other SQL implementations.

The following is a trademark of Bell Telephone Laboratories, Incorporated: Unix. The following are trademarks or service marks of International Business Machines, Incorporated: CICS, DB2, and SQL/DS. The following are trademarks or service marks of Tandem Computers Incorporated: Encompass, Enform, Enscribe, FastSort, Guardian, NonStop, NonStop SQL, Pathway, Pathmaker, SafeGuard, and Tal.

TABLE OF CONTENTS
INTRODUCTION
AN OVERVIEW OF THE
21. [Figure 3: A schematic of the catalog tables, showing the columns of the TABLES, COLUMNS, BASETABS, INDEXES, FILES, CONSTRAINTS, PARTITIONS, PROGRAMS, COMMENTS, and USAGES tables.]

Unifying Logical and Physical DDL

The SQL language integrates data definition and data manipulation in one language. This is a nice simplification when compared to the DBTG separation of data definition from data manipulation [DBTG]. NonStop SQL continues this simplification by merging the definition of logical schema (tables) with the definition of physical schema (files). It hides the logical-physical distinction from the user, but the underlying implementation makes a clear distinction between logical and physical attributes.

To give a specific example, the definition of the EMP table created in the previous section combined logical design information (primary keys, columns) with some physical design information (in that case, partitioning of the table into two physical files). All this informat
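An EMP definition of the kind described might read roughly as follows. This is a hedged reconstruction: the "previous section" with the original definition is not part of this excerpt, and the PARTITION clause spelling, names, and key value are all illustrative assumptions:

```sql
-- Logical design (columns, primary key) and physical design
-- (a second partition on another volume) in one statement:
CREATE TABLE emp
  ( empnum   NUMERIC (4) NOT NULL,
    name     CHAR (20),
    deptnum  NUMERIC (4),
    PRIMARY KEY (empnum) )
  PARTITION ( $DATA2.SALES.EMP1 FIRST KEY 5000 );
```

The user sees one table, EMP; the implementation records the logical attributes in TABLES and COLUMNS and the physical attributes in FILES and PARTITIONS.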
22. a highly parallel SQL implementation. In addition, SQL is a natural base for data sharing among heterogeneous systems, because most systems support SQL. It is also an excellent base for end-user and fourth-generation languages.

AN OVERVIEW OF THE TANDEM SYSTEM

Tandem builds a single-fault-tolerant distributed system for on-line transaction processing. The system can be grown in small increments by adding processors and disks to a cluster, and clusters to a network.

Hardware Architecture

The Tandem hardware architecture consists of conventional processors, each with up to 16MB of main memory and a 5MB/sec burst multiplexor I/O channel. The processors do not share memory. Dual 20MB/sec local networks can connect up to 16 processors. A fiber optic local network allows four-plexed 1MB/sec interconnect of up to 224 processors (see Figure 1). A variety of long-haul public network protocols can be used to connect up to 4000 processors. Communication and disk device controllers are dual-ported and attach to two processors, so that there are dual paths to the storage and communication media. Disk modules come in two styles: low cost per actuator and low cost per megabyte. These modules are packaged 2, 4, 6, or 8 to a cabinet. Typically, each disk is duplexed, so that media and electronic failures of a disk will not interrupt service.

Operating System and Network

Processes are the software analog of hardware modules. They are the unit of functiona
23. al prefetch and postwrite on demand, or if it detects three sequential accesses in a row by a client. In sequential mode, the disc process does two things: it moves data to and from disc in large transfers (typically 30KB to 60KB), and it ages used sequential pages quickly to prevent sequential flooding of the cache. Similar bulk I/O logic is included in the cache flush logic needed for quick system restart from disc; main memory databases can be written to disc with a few large transfers rather than many small ones. The use of bulk I/O reduces the number of I/Os, CPU consumption, and controller consumption. It allows batch, sequential, and random online transactions to coexist [Borr 1]. When sequential scanning is moved to the client, CPU demand rises by a factor of three, and the throughput drops by the same factor of three, due to message traffic. If sequential buffering is not used, another factor of three is typically lost due to the increased message traffic.

Performance on the DebitCredit Benchmark

We expected the benefits of remote execution of single-variable queries for set operations, but did not anticipate the benefits for single-record operations. The original goal was to match the performance of the Cobol record-at-a-time interface to within 25% when executing the DebitCredit transaction, a simple on-line transaction processing application which updates three tables by key and inserts a record in a fourth table [Anon]. But when the applicatio
called DATE in SQL. Rather, each field must be individually named by the program and moved by SQL. Typical applications have 100 such fields per record on average, and 1000 fields per record is not unusual. An INSERT or FETCH statement must specify all 1000 values in the correct order for such a table. This is error-prone and inconvenient. As defined in the ANSI standard, the SQL executor must move each field to and from individual host-language variables. So if the programmer wants to insert or fetch a record with 1000 fields, the field-move logic is exercised 1000 times for each record select or insert. No matter how well that logic is optimized, it will have a difficult time competing with the record-at-a-time interface, which treats the entire record as a string of bytes and moves it with a single instruction. The solution to this is to provide group support in the SQL language.

A related issue is comparing groups of fields. In most SQL systems, if one wants to do a lexical compare of several fields, the syntax is:

(a = a' AND b = b' AND c >= c') OR (a = a' AND b > b') OR (a > a')

By introducing tuple variables, one gets the syntax:

(a, b, c) >= (a', b', c')

This construct is surprisingly common; fortunately, this new syntax is now a standard part of SQL2.

Parameters at Compile Time

If the SQL optimizer is presented with literals in a selection statement, it can generally make a reasonable estimate of the selectivity of the predicate
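The equivalence between the expanded lexical-compare predicate and the SQL2 row-value form discussed above can be checked directly. Python's built-in tuple comparison is already lexicographic, which makes the sketch compact (this is an illustration of the equivalence, not NonStop SQL code):

```python
def lexical_ge(a, b, c, a2, b2, c2):
    # Expanded predicate from the text:
    # (a = a' AND b = b' AND c >= c') OR (a = a' AND b > b') OR (a > a')
    return (a == a2 and b == b2 and c >= c2) or \
           (a == a2 and b > b2) or \
           (a > a2)

def row_ge(a, b, c, a2, b2, c2):
    # SQL2 row-value form: (a, b, c) >= (a', b', c')
    # Python tuples compare lexicographically, field by field.
    return (a, b, c) >= (a2, b2, c2)
```

Exhaustively comparing the two over a small domain shows they agree everywhere, which is exactly why the row-value syntax can replace the hand-expanded predicate.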
compile at static compilation time. At run time, the statement would be compiled dynamically on first invocation, with the literals bound in for the host variables. This is similar to the first scheme, but is more automatic.

Update Statistics

The SQL compiler picks execution plans based on estimated table sizes, record sizes, index selectivity, and, for each field, the number of unique values and the minimum and maximum values. All these indicators are collected by a utility SQL command called UPDATE STATISTICS. For the 20GB database used in the DebitCredit benchmark, the first implementation of this utility would have taken about a day to collect all the statistics on the various tables. Since NonStop SQL is intended for databases far in excess of 20GB, something had to be done. UPDATE STATISTICS now samples the database and estimates the statistics; its maximum running time is now a few minutes. A complete scan is done for small tables (< 4MB). For a 20GB table, a 64,000-bit hash table is allocated for each field and 32,000 records are examined. The values of each record's fields are hashed into its hash table, and the min and max value of each field is maintained. After all these records have been examined, the minimum value, maximum value, and number of unique values are estimated. This algorithm is in the spirit of Sample Counting, described by Astrahan [2].

SUMMARY

Virtually every commercial compute
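The sampling scheme behind the revised UPDATE STATISTICS can be sketched for a single field. This is a loose reconstruction: the 64,000-bit table and 32,000-record sample come from the text, but the unique-count estimator shown (a linear-counting-style correction for hash collisions) is an assumption, not necessarily the formula NonStop SQL uses:

```python
import math
import random

def estimate_field_stats(values, sample_size=32_000, table_bits=64_000):
    """Sample a column, hash sampled values into a fixed-size bit table,
    and track min/max while scanning; estimate uniques from bits set."""
    sample = values if len(values) <= sample_size \
        else random.sample(values, sample_size)
    bits = bytearray(table_bits)
    lo = hi = sample[0]
    for v in sample:
        lo, hi = min(lo, v), max(hi, v)
        bits[hash(v) % table_bits] = 1
    # Fraction of bits set understates uniques because of collisions;
    # invert the expected-fill formula to correct (assumed estimator).
    set_frac = sum(bits) / table_bits
    uniques = round(math.log(1 - set_frac) / math.log(1 - 1 / table_bits))
    return lo, hi, uniques
```

A full scan of a terabyte is replaced by one pass over a fixed-size sample, which is why the running time drops from about a day to a few minutes.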
d has complex semantics. In particular, the state of locks and cursors after a COMMIT WORK verb is difficult to explain. NonStop SQL defined some rules which seem consistent: a FREE RESOURCES verb was introduced to free locks and cursors on non-audited tables, and the COMMIT WORK and ROLLBACK WORK verbs implicitly issue a FREE RESOURCES verb. For simplicity, design experts recommend against using non-audited tables for anything but temporary results. Non-audited tables are useful for batch applications using the old-master new-master recovery protocol; a table may be made temporarily un-audited during a batch run. Non-audited tables are used as scratch files for sort and the query evaluator. In addition, to reduce log activity and the consequent large log files, index creation and table load run non-audited and then alter the table to be audited.

Local Autonomy

Local autonomy requires that NonStop SQL offer access to local data even if part of it is unavailable, and even if the site is isolated from the rest of the network. For compiled SQL plans, this means that the SQL compiler must automatically pick a new plan if a chosen access path (i.e., an index) becomes unavailable. This recompilation is automatic and transparent to the user. Data definition operations pose more difficult problems. If a table is partitioned among several nodes of a network, then dropping the table requires work at each of those nodes: both updates to the catalogs and
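The automatic-recompilation behavior for unavailable access paths amounts to a simple control flow, sketched below. The plan representation is invented for illustration; only the fallback logic mirrors the text:

```python
def run_query(compiled_plan, recompile, is_available):
    """Run the statically compiled plan if its access path is usable;
    otherwise transparently recompile against the remaining paths."""
    if is_available(compiled_plan["path"]):
        return compiled_plan["execute"]()
    new_plan = recompile()  # optimizer picks a plan avoiding the lost index
    if new_plan is None:
        # e.g. the only remaining plan is a sequential scan of a huge table
        raise RuntimeError("no reasonable plan")
    return new_plan["execute"]()
```

The point of the sketch is that the caller never sees the recompilation: the same call either runs the original plan or a freshly chosen one.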
d their corresponding file names are the same. The goal was that most SQL examples from Date [Date] should work without change when entered from a terminal. Having one naming convention for the whole Tandem system simplifies learning and operating the system.

Logical Names for Location Independence

System administrators and application designers need the ability to bind a program to new tables without altering the source code. In production systems, a program is typically created and tested in a test environment and then moved to a production environment. In a distributed system, a program may be duplicated at many different sites. Running a report against many instances of a generic table is another common situation. With most SQL systems, each of these situations requires editing the program and changing the table names to reference the desired tables. Some SQL systems offer synonyms, a user-specific name mapping of the form

CREATE SYNONYM programmer.emp FOR test.emp

so that the name programmer.emp will map to test.emp. When the program is moved to production, the synonym is changed to

DROP SYNONYM programmer.emp
CREATE SYNONYM programmer.emp FOR production.emp

As an alternative, NonStop SQL offers logical names, called defines, which allow users to rebind a program's table names at SQL compile or run time without altering the source program. Defines are similar in function to OS/360 DD cards, TSO ALLOCATEs, and CMS FILEDEFs. The
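A define-style logical name can be mimicked with a name-resolution table consulted before compilation or execution. This is an illustrative sketch only; the "=EMP" spelling is loosely modeled on Tandem defines, and the class is invented:

```python
class Defines:
    """Per-invocation logical-name table: programs reference logical
    names; the map rebinds them to physical tables without any edit
    to the program source."""

    def __init__(self):
        self._map = {}

    def add(self, logical, physical):
        self._map[logical] = physical

    def resolve(self, name):
        # Unmapped names pass through unchanged.
        return self._map.get(name, name)

# Same program text binds to a test table here, a production table elsewhere.
defines = Defines()
defines.add("=EMP", "test.emp")
query = "SELECT name FROM " + defines.resolve("=EMP")
```

Moving the program to production then means changing one define, not editing and recompiling every statement that names the table.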
deletion of the files. Changing table attributes has similar requirements. In general, NonStop SQL requires that all nodes related to a table participate in the DDL operation. If any relevant node, catalog, or disk process is unavailable, then the DDL operation is denied. This violates local autonomy, and there are no attractive alternatives here. NonStop SQL takes the following approach: the local owner can DISCONNECT a table to break all linkages of the table with unavailable sites. This is a unilateral operation. Thereafter, the user can operate on the table without complaint. At a later time, the user can run a utility which attempts to CONNECT the pieces of a disconnected table back together.

Conversational Interface

Almost all SQL systems come with a program which executes any SQL statement entered from a terminal. This is an easy program to write, since most SQL implementations provide dynamic SQL: a facility to PREPARE, DESCRIBE, and EXECUTE any SQL statement, much like EVAL in LISP. Systems differ in the bells and whistles they provide with this basic facility. The NonStop SQL conversational interface, SQLCI, includes extensive documentation via on-line help text, on-line diagnostic messages, and an interactive report writer. The NonStop SQL manuals are written so that they may be accessed by the conversational interface: the user can ask for help on any topic in the manual or for explanation of an error message. For examp
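The PREPARE/EXECUTE style of dynamic SQL that makes such conversational programs easy to write can be illustrated with any engine that accepts statement strings at run time; here Python's sqlite3 stands in (SQLCI itself is not shown, and the table is invented):

```python
import sqlite3

def execute_statement(conn, text):
    """Prepare and execute an arbitrary statement string, returning
    either the result rows or a diagnostic, in the spirit of a
    conversational interface."""
    try:
        cur = conn.execute(text)  # prepare + execute in one step
        return cur.fetchall(), None
    except sqlite3.Error as e:
        return None, str(e)       # report the error instead of crashing

conn = sqlite3.connect(":memory:")
execute_statement(conn, "CREATE TABLE emp (name TEXT, salary INTEGER)")
execute_statement(conn, "INSERT INTO emp VALUES ('Jones', 1000)")
rows, err = execute_statement(conn, "SELECT name FROM emp")
```

The loop of a terminal front end is then just: read a line, pass it to `execute_statement`, and print either the rows or the diagnostic.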
e the rationale for adopting SQL was that the Encompass dictionary is passive and proprietary. Encompass provided a record-at-a-time interface for programmers and little data independence. Like most such systems, it was built on top of the file and security system rather than being integrated with it. Customers were asking for an integrated and active dictionary: one which had no back doors, and one which assured consistency between the dictionary and applications. In addition, customers were asking for support of views and integrity constraints. SQL provides views and a standard data definition and manipulation language. In addition, NonStop SQL provides an active, distributed, and integrated dictionary. Tandem is gradually evolving its Encompass application development environment to a dictionary based on SQL.

After settling on SQL, the build-versus-buy decision had to be made. Several software houses were willing to port their SQL system to Tandem hardware. This alternative offered a low-cost and low-risk solution. It also offered low benefit: Tandem wanted SQL to be integrated, fault-tolerant, high-performance, and distributed. So a second courageous decision was made to s