Friday, November 6, 2009

DBMS research: publication of software

Secondo Plugins: A Platform to Publish Research Implementations

A significant part of database research concerns the development of new data structures or algorithms, e.g. for indexing or query processing. Whereas the papers describing these structures are easily accessible after publication, the software developed and used in the experiments is all too often lost. This is unfortunate for several reasons. First, authors of later papers who try to offer an improved solution need to reimplement the previous software, a process that is not only a waste of effort but also error-prone: what they compare against may not be the exact proposal of the original authors. Second, readers of the paper cannot repeat the experiments or run other experiments. Third, the software cannot be used in a system context.

We offer a platform to publish such software together with the paper describing it, as a so-called Secondo Plugin. Secondo is an extensible DBMS prototype developed for many years at University of Hagen. It is extensible at all levels, including the kernel system, the query optimizer and the graphical user interface. In particular, the kernel is extensible by so-called algebra modules, which offer collections of data types and operations. Secondo provides fairly sophisticated management of spatial data and moving objects.

An index structure or query processing algorithm developed in research may be offered within a new algebra module as a type or operator, respectively. Operators can be called directly at a query interface below the optimizer. This allows one to easily test indexes or algorithms without having to work around the query optimizer (i.e. before integration into query optimization has been achieved).

The recently added feature of Secondo plugins enables anyone to publish a Secondo extension, independently of the Secondo team. Essentially, the extension components need to be wrapped into a zip file. This includes an XML file describing which extensions have to be put where within a standard Secondo distribution. The plugin may then be published together with the paper, e.g. on the authors' web site. A reader of the paper may get a Secondo system from University of Hagen and the plugin from the authors' web site. A small installation program integrates the plugin into the system.

We invite you to publish your new research implementations in this way. The advantages are the following.

- Readers of your paper will be able to repeat your experiments, especially if you provide Secondo scripts (files with Secondo commands) executing related queries. They may also run other experiments that you did not think of, or use other data sets.
- Authors of later papers will compare against your algorithm rather than others (when they have a choice), because this is easy to do. They will compare against your correct implementation, not against what they mistakenly believe you implemented.
- This in turn will lead to more citations of your paper.
- Your software will go into a pool of query processing software and hence become more popular than it otherwise would. It may even be used for practical applications.

See the Secondo system at http://dna.fernuni-hagen.de/Secondo.html/index.html

The Secondo Plugin concept is explained via that site or directly at http://dna.fernuni-hagen.de/Secondo.html/start_content_plugins.html

The site also lists some plugins that are already available. When you publish a plugin, you are welcome to notify us so that we can include a link on our web site.

Ralf Hartmut Güting and the Secondo team
University of Hagen
http://dna.fernuni-hagen.de/gueting/home.html
