Technical Status Report - 95Q3
Ian Davis
University of Waterloo

Progress Report

In the period October 1st - December 30th 1994 our primary focus was on the development of a data model for text (110), the design of the DML (150), improvement of the paper entitled 'Text/Relational Database Management Systems: Harmonizing SQL and SGML' (120 & 150), and software development which will lead to the creation of a hybrid query processor (320, 331, 341).

We were actively involved in the Hybrid Query Processor (HQP) Working Group meetings held on October 25th and November 29th, and distributed minutes of both meetings to CSSC. At the meeting of October 25th it was agreed that the data model requirements statement (110) that we had earlier circulated could be distributed to CSSC as a final document. Software which validates SQL'92 DML queries incorporating our proposed extensions to accommodate text (150) was also distributed as source to all members of CSSC during this meeting. After some initial problems in porting this source code (which we addressed), Fulcrum reported that they were able to compile and use this software successfully.

Prior to the meeting of November 29th we received input from several members of the HQP Working Group on the application requirements of the HQP, and distributed a data model statement (110) for review by the other members of the Working Group.

Considerable progress was made at the HQP Working Group meetings in recognising that while our earlier DML proposals were sound, they did not go far enough in addressing the problems raised by TEXT. It was established that our proposal to require all TEXT in a single relational column to conform to a single grammar was too restrictive, and that in general grammars should be associated with instances of text rather than with relational columns. It was recognised that subtext needs not only to be extracted from its context but also to be identifiable within some larger context.
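The two conclusions above, that a grammar belongs to an individual instance of text and that subtext must remain identifiable within its larger context, can be illustrated with a small sketch. This is not the data model under development; the names `TextInstance`, `SubtextRef` and `find_subtext`, and the use of character offsets to locate subtext, are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextInstance:
    """A text value that carries its own grammar (hypothetical model)."""
    grammar: str   # name of the grammar governing this one instance
    content: str   # the marked-up text itself


@dataclass(frozen=True)
class SubtextRef:
    """A piece of subtext that remembers where it sits in its parent."""
    parent: TextInstance
    start: int     # offset of the first character within parent.content
    end: int       # offset one past the last character

    def extract(self) -> str:
        """Return the subtext on its own, detached from its context."""
        return self.parent.content[self.start:self.end]

    def context(self, width: int) -> str:
        """Return the subtext plus up to `width` characters on each side."""
        lo = max(0, self.start - width)
        hi = min(len(self.parent.content), self.end + width)
        return self.parent.content[lo:hi]


def find_subtext(text: TextInstance, needle: str) -> Optional[SubtextRef]:
    """Locate the first occurrence of `needle`, keeping its position."""
    idx = text.content.find(needle)
    if idx < 0:
        return None
    return SubtextRef(parent=text, start=idx, end=idx + len(needle))
```

Because each `TextInstance` names its own grammar, two values stored in the same column could follow different grammars, and a `SubtextRef` can be either extracted or viewed in place.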
It was agreed that in addition to using the HQP to manipulate TEXT, applications have a legitimate need to manipulate and navigate within TEXT directly, and that TEXT should therefore be transmitted across the API as TEXT rather than as standard relational values.

We are preparing an HQP Requirements statement and an HQP Formal Design document, and are extending our earlier proposed data model for text to include a definition of the operations which we wish this model to support. We are also working on improving the DDL and DML design described in our evolving SQL/SGML paper.

We are extending the language accepted by our SQL'92 parser to include support for conventional DDL (130,131), and have begun converting parsed SQL statements into operational structures which better define the operations to be performed by the HQP (160). We are concurrently developing software which will perform:

- view substitution (which is needed to support the concept of hybrid relational tables);
- view merging;
- optimisation of boolean equalities to allow the underlying engines to perform joins when possible;
- decomposition of operational structures into operations that can be handled by the underlying text and relational engines;
- recomposition of appropriate parts of these operational structures into SQL commands that can be transmitted across the ODBC interfaces (320,330).

We are also developing software which translates the resulting HQP operations into an execution plan.

We have enhanced the SQL engine developed earlier so that it can both efficiently buffer the results of subqueries and recompute these buffered results whenever the values on which they depend change (320,330). This engine has also been enhanced to support the various types of join defined in SQL'92. We have also extended the HQP catalog/dictionary in various ways.
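The buffering of subquery results mentioned in this report can be sketched as a cache keyed on the values the subquery depends on. This is a minimal illustration under stated assumptions, not the engine's implementation: the class `BufferedSubquery`, its `evaluate` callback, and the sample `orders` data are all hypothetical.

```python
from typing import Any, Callable, Hashable, List, Optional, Tuple


class BufferedSubquery:
    """Buffer a subquery's rows; recompute only when its inputs change."""

    def __init__(self, evaluate: Callable[..., List[Any]]) -> None:
        self._evaluate = evaluate   # runs the subquery for given outer values
        self._key: Optional[Tuple[Hashable, ...]] = None
        self._rows: List[Any] = []
        self.evaluations = 0        # number of real executions so far

    def rows(self, *outer_values: Hashable) -> List[Any]:
        """Return buffered rows, re-running the subquery if inputs differ."""
        if self._key is None or self._key != outer_values:
            self._rows = self._evaluate(*outer_values)
            self._key = outer_values
            self.evaluations += 1
        return self._rows


# Hypothetical sample data standing in for a relational table.
orders = [("ann", 10), ("bob", 25), ("ann", 5)]


def amounts_for(customer: str) -> List[int]:
    """Stand-in for a correlated subquery over `orders`."""
    return [amount for name, amount in orders if name == customer]
```

Calling `rows("ann")` twice in succession runs `amounts_for` once; a following call with `"bob"` triggers a recomputation because the value the subquery depends on has changed.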
Achievements

The major progress in our (100) activities has been the acceptance of our data model requirements statement, the development of a data model for text, and the refinement of many of our earlier proposed ideas.

The major progress in our (300) activities has been the development of much of the internal software necessary to facilitate the translation and execution of HQP statements expressed using SQL'92.

Problem review

The HQP Working Group first met on September 28th, which left only three months to complete the design of the DDL and DML. Concern was raised in our last status report about our collective ability to complete this exercise in so short a period of time. This concern has been echoed by other members and documented in the minutes of each of our subsequent meetings. We have (as a Working Group) indicated that the design exercise will not be completed by December 31st 1994 as originally planned.

Before our next meeting we intend to provide the HQP Working Group with a draft of a data model statement that describes both the proposed data model for text and the operations that can be performed on this model (110). We are also planning to release a revised document proposing DDL (120) and DML (170) extensions prior to this meeting.

Some of the issues raised by the HQP Working Group regarding the representation of TEXT pertain to the API. There is a need to begin focusing on the API used both to retrieve text and to navigate within it.

Coming events plan

Weekly meetings of the research group within the University will continue throughout the next period.

The next meeting of the HQP WG will be on Tuesday January 24th, 1995. We will meet from 9:30 am - 4 pm. InContext will host this meeting in Toronto.
Project Status

No.  Task  Description                       Start    End      Done
  6  110   Model Requirements Statement [A]  94May01  94Sep30  100%
  7  110   Model Requirements Statement [B]  94Oct01  95Jan30   50%
 18  120   DDL Design [A]                    94Jan01  94Dec30   75%
 19  120   DDL Design [B]                    95Feb01  95Feb30    5%
 20  120   DDL Design [C]                    95Mar01  94Mar30    0%
 22  131   DDL Interface Validator [A]       94Oct01  95Jan30   25%
 23  131   DDL Interface Validator [B]       95Feb01  95Mar30    0%
 33  140   DDL Specification [A]             95Jul01  95Dec30    0%
 34  140   DDL Specification [B]             96Dec01  96Jun30    0%
 42  150   DML Design [A]                    94Jan01  94Jun30  100%
 43  150   DML Design [B]                    94Jul01  94Sep30  100%
 44  150   DML Design finalised              94Oct01  95Jan30   50%
 46  161   DML Interface Validator [A]       95Jan01  95Jun30  100%
 47  161   DML Interface Validator [B]       95Jul01  96Jan31    0%
 48  161   DML Interface Validator [C]       96Feb01  96Jun30    0%
 49  161   DML Interface Validator [D]       96Jul01  96Dec30    0%
 59  170   DML Specification                 95Jul01  95Dec30    0%
 60  170   DML Specification                 96Jan01  96Jun30    0%
 66  180   API Design [A]                    94Oct01  94Dec30  100%
 67  180   API Design [B]                    95Jan01  95Sep30    0%
 68  180   API Design [C]                    95Oct01  95Dec30    0%
 74  190   API Specification [A]             96Jan01  96Jun30    0%
 75  190   API Specification [B]             96Jul01  96Dec30    0%
125  310   HQP Requirements Statement [A]    94Oct01  95Jan01   50%
126  310   HQP Requirements Statement [B]    95Jan01  95Mar30    0%
132  320   HQP Design [A]                    94Jan01  94Dec30   75%
133  320   HQP Design [B]                    95Jan01  95Mar30    0%
135  330   HQP Prototype [A]                 94Jan01  94Jun30  100%
136  330   HQP Prototype [B]                 94Jan01  95Jun30   40%
137  330   HQP Prototype [C]                 95Jul01  96Jan31    0%
138  330   HQP Prototype [D]                 96Feb01  96Jun30    0%
146  341   HQP/Agent Integration             94Sep01  95Jun30   40%
146  341   Agent/Oracle Integration          95Apr01  95Sep30   15%
147  341   Agent/Fulcrum Integration         95Oct01  96Jun30   15%
148  341   Agent/Pat Integration             95Oct01  96Jun30    5%
     900   Project Coordination              93Apr01  97Mar31   25%

Ian Davis. December 20th, 1994