Technical Status Report - 97Q4

Ian Davis
University of Waterloo

Progress Report

In the period January 1 - March 31, 1997, our primary activities involved supporting our public World Wide Web beta trial and documenting the results of our research.

Achievements

In the past quarter we focused on documenting and showcasing all aspects of our research. We met with Industry Canada in Ottawa for a final review of our activities and the deliverables resulting from them.

We began work on a paper titled "Experience with a Text/Relational Database Management System" and contributed to a paper titled "HQP: la gestion et l'integration des donnees relationnelles et textuelles", produced by Grafnetix, which has been accepted for publication in the journal "Expertise Informatique". In addition, we prepared a lengthy article for the final CSSC newsletter.

We developed a version of our web demonstration that can be installed and accessed on a PC without the use of our software. This stand-alone demonstration provides a generous sampling of our web pages, and meets our CSSC deliverable for an archive version of our web demonstration.

We continued to devote effort to improving our online demonstration. Work has begun on marking up the 1997 University of Waterloo calendar, and on expanding our TPCD benchmarks so that they operate on 1 gigabyte of relational data, as required by the official TPCD benchmark.

During the last three months we worked to document all software interfaces used within our prototype software, so that both source and object code remain viable following completion of this project.

Problem review

When we increased the total size of our TPCD text files from 10 megabytes to 1 gigabyte, we were unable to effectively index the resulting text.
The text indexing software imposes a limit on the number of distinct words it expects to encounter in a text document, and this limit was rapidly exceeded because the benchmark data generation software generates random word values. The inability to build word list indices on large relational texts may significantly degrade performance. It suggests that earlier reported results for encoding large volumes of relational data within text may not scale to very large relational tables without some improvement in how words are indexed by the underlying text engine.

Adding appropriate markup to the 1997 University of Waterloo calendar is proving challenging. Considerable effort was needed to correct basic HTML errors in the calendar so that it could be validated by a conventional SGML parser. Further considerable effort was then needed to rework the software that automated the addition of descriptive markup to the 1996 calendar, so that it adds the correct descriptive markup to the 1997 calendar. This rework is needed because significant changes have been made to the format of the 1997 calendar. Using heuristics to add descriptive markup to texts that are periodically updated is fraught with difficulty, and should be avoided if at all possible.

Coming events plan

Since the CSSC project is now complete, no future CSSC-related activities are planned.
Project Status

110  Model Requirements [A]           94May01  94Sep30  100%
110  Model Requirements [B]           94Oct01  95Jan30  100%
120  DDL Design [A]                   94Jan01  94Dec30  100%
120  DDL Design [B]                   95Feb01  95Feb30  100%
120  DDL Design [C]                   95Mar01  94Mar30  100%
131  DDL Interface Validator [A]      94Oct01  95Jan30  100%
131  DDL Interface Validator [B]      95Feb01  95Mar30  100%
140  DDL Specification [A]            95Jul01  95Dec30  100%
140  DDL Specification [B]            96Dec01  96Jun30  100%
150  DML Design [A]                   94Jan01  94Jun30  100%
150  DML Design [B]                   94Jul01  94Sep30  100%
150  DML Design [C]                   94Oct01  95Jan30  100%
161  DML Interface Validator          95Jan01  95Jun30  100%
170  DML Specification [A]            95Jul01  95Dec30  100%
170  DML Specification [B]            96Jan01  96Jun30  100%
180  API Design                       95Jan01  95Dec30  100%
190  API Specification                96Jan01  96Jun30  100%
310  HQP Requirements [A]             94Oct01  95Jan01  100%
310  HQP Requirements [B]             95Jan01  95Mar30  100%
320  HQP Design [A]                   94Jan01  94Dec30  100%
320  HQP Design [B]                   95Jan01  95Mar30  100%
331  HQP Prototype [A]                94Jan01  94Jun30  100%
331  HQP Prototype [B]                94Jan01  95Jun30  100%
331  HQP Prototype [C]                95Jul01  96Jan31  100%
331  HQP Prototype [D]                96Feb01  96Jun30  100%
341  Agent/Oracle Integration         95Apr01  95Sep30  100%
341  Agent/DB2 Integration            95Apr01  95Sep30  100%
341  Agent/Fulcrum Integration        95Oct01  96Jun30  100%
341  Agent/Pat Integration            95Oct01  96Jun30  100%
515  WWW applications for Beta Trial  96Apr01  96Jul01  100%
523  Web Site CGI Gateway to HQP      95Sep30  96Sep30  100%
700  Testing                          95Oct01  96Sep30  100%
800  Beta Trial                       96Sep30  97Mar31  100%
900  Project Coordination             93Apr01  97Mar31  100%

Ian Davis. April 2, 1997