Knowledge Discovery and Ontologies (KDO-2004)

Workshop at ECML/PKDD 2004

15th European Conference on Machine Learning (ECML)

8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD)

Pisa, Italy

September 24, 2004










Invited Speakers



Gholamreza Nakhaeizadeh - DaimlerChrysler, Ulm, Germany

From Data Mining to Text Mining. Is consideration of background knowledge in the mining process possible?


In this talk we will give an overview of applications of data and text mining in industry and business, and discuss, from a practical point of view, the problem of taking background knowledge into account in the mining process.



Jürgen Angele - Ontoprise, Karlsruhe, Germany

Ontology-Based Query and Answering in Chemistry: OntoNova @ Project Halo
Project Halo has the long-term objective of developing a Digital Aristotle, i.e. a knowledge system that can answer questions in a particular domain and explain its answers. This talk reports on the Ontoprise contribution to the Halo Pilot Project, in which various competing ontology engineering methodologies and knowledge system capabilities were investigated. Concerning the former, I'll describe how we engineered a significant set of laws from chemistry that had to interact at different levels of generality and in varying orders. With regard to the latter, I'll report on the ability of our system to produce coherent and concise explanations of its reasoning. The importance of these two aspects for the Semantic Web can hardly be overestimated: as it grows, the interaction of large sets of laws will require dedicated management, and users will need to be able to explore the trustworthiness of the ontology and the underlying data sources.








Welcome by Workshop Chairs



Invited talk

Gholamreza Nakhaeizadeh: From Data Mining to Text Mining. Is consideration of background knowledge in the mining process possible?










Session: Using ontologies for data mining

·        N. S. Hoa, N. H. Son: Rough Set Approach to Approximation of Concepts from Taxonomy

·        H. Češpivová, J. Rauch, V. Svátek, M. Kejkula, M. Tomečková: Roles of Medical Ontology in Association Mining CRISP-DM Cycle

·        L. Brisson, M. Collard, K. Le Brigand, P. Barbry: KTA: A Framework for Integrating Expert Knowledge and Experiment Memory in Transcriptome Analysis

·        M. G. Ceruti: Ontology for Level-One Sensor Fusion and Knowledge Discovery



Poster session

·        I. Astrova, B. Stantic: Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms

·        M. Baglioni, M. Nanni, E. Giovannetti: Mining Literary Texts by Using Domain Ontologies

·        M. Cannataro, P. H. Guzzi, T. Mazza, P. Veltri: Using Ontologies in Proteus for Modeling Data Mining Analysis of Proteomics Experiments

·        T. Euler, M. Scholz: Using Ontologies in a KDD Workbench

·        A.-M. Giuglea, A. Moschitti: Knowledge Discovering using FrameNet, VerbNet and PropBank

·        L. Karoui, M.-A. Aufaure, N. Bennacer: Ontology Discovery from Web Pages: Application to Tourism

·        F. A. Lisi, F. Esposito: An ILP Approach to Semantic Web Mining

·        Z. Nazeri, E. Bloedorn: Exploiting Available Domain Knowledge to Improve Mining Aviation Safety and Network Security Data






Invited talk

Jürgen Angele: Project Halo







Session: Using ontologies for text/web mining

·        M. Vanzin, K. Becker: Exploiting Knowledge Representation for Pattern Interpretation

·        J. Leskovec, M. Grobelnik, N. Milic-Frayling: Learning Sub-structures of Document Semantic Graphs for Document Summarization

·        M. Labský, V. Svátek, O. Šváb: Types and Roles of Ontologies in Web Information Extraction


Coffee break






Session: Data/text mining for ontology development

·        T. T. Quan, S. C. Hui, T. H. Cao: FOGA: A Fuzzy Ontology Generation Framework for Scholarly Semantic Web

·        G. Paaß, J. Kindermann, E. Leopold: Learning Prototype Ontologies by Hierarchical Latent Semantic Analysis






Session: KDD and ontologies – closing the loop

·        D.-K. Kang, A. Silvescu, J. Zhang, V. Honavar: Generation of Attribute Value Taxonomies from Data and Their Use in Data-Driven Construction of Accurate and Compact Naive Bayes Classifiers

·        S. Legrand, J. R. G. Pulido: A Hybrid Approach to Word Sense Disambiguation: Neural Clustering with Class Labeling


Wrap-up discussion



Topic and Motivation



The workshop is concerned with the interaction between prior knowledge as encoded in ontologies and derived knowledge as obtained by a knowledge discovery process.


The use of prior knowledge may significantly enhance knowledge discovery from large datasets or text collections. Currently, in most KDD projects, prior knowledge is present only implicitly (in the head of the human analyst) or in the form of textual documentation. Even in knowledge-intensive approaches such as ILP, the background knowledge is often not organized around a well-formed conceptual model. This practice seems to ignore the latest developments in knowledge engineering, where domain knowledge is typically defined by formal ontologies.


Ontologies are formal, explicit specifications of shared conceptualizations of a given domain of discourse. The expected central role of ontologies in the organization and functioning of the Semantic Web has been well documented in recent years. Somewhat less traditional is the role of ontologies in incremental approaches to knowledge discovery, in which ontologies and machine learning methods are used in combination to mine, interpret and (re-)organise knowledge.


While knowledge engineering research has already recognised the value of knowledge discovery from textual and semi-structured resources in the process of building an ontology (i.e. in ontology learning), links in the opposite direction are rarer. With this workshop we intend to bring together researchers from both directions in order to initiate a discussion on how to integrate insights from both communities.



Areas of Interest


Submissions are invited on all aspects of the interaction between Knowledge Discovery (KDD) and Ontologies (Onto):


*  Onto4KDD

   ·  domain ontologies used to extend source data and/or to constrain the search for hypotheses

   ·  ontologies of the knowledge discovery domain used to recommend and/or configure methods or tools for a given problem

*  KDD4Onto

   ·  knowledge discovery (from databases, semi-structured data, text, multimedia) in ontology learning (extension/evolution) and/or ontology mapping (merging/integration)

*  Onto4KDD & KDD4Onto

   ·  combinations of knowledge discovery and ontologies to cover the full knowledge life cycle, i.e. the use of ontologies in incremental knowledge discovery experiments (based on previous results)

*  Usability & Evaluation

   ·  practical applications in industry of (the interaction of) knowledge discovery and ontologies

   ·  evaluation methodologies for (the interaction of) knowledge discovery and ontologies


Organizing Committee


Paul Buitelaar – DFKI, Saarbrücken, Germany

Jürgen Franke – DaimlerChrysler, Ulm, Germany

Marko Grobelnik – Jozef Stefan Inst., Ljubljana, Slovenia

Gerhard Paaß – Fraunhofer AIS, St. Augustin, Germany

Vojtěch Svátek – Univ. of Economics, Prague, Czech Republic (contact person)


Program Committee


Eneko Agirre, Univ. of the Basque Country, Spain

Nathalie Aussenac-Gilles, IRIT, Toulouse, France

Abraham Bernstein, Univ. of Zürich, Switzerland

Paul Buitelaar, DFKI, Saarbrücken, Germany

Mario Cannataro, Univ. of Catanzaro, Italy

Walter Daelemans, Univ. Antwerpen, Belgium

Thierry Declerck, DFKI & Univ. Saarbrücken, Germany

Jürgen Franke, DaimlerChrysler, Ulm, Germany

Aldo Gangemi, ISTC Roma, Italy

Marko Grobelnik, Jozef Stefan Inst., Ljubljana, Slovenia

Siegfried Handschuh, University of Karlsruhe, Germany

Andreas Hotho, University of Karlsruhe, Germany

Eduard Hovy, USC/ISI, Marina del Rey CA, USA

Mike Jackson, University of Central England, UK

Alipio Jorge, University of Porto, Portugal

Vipul Kashyap, Partners HealthCare Systems, Wellesley MA, USA

Jörg Kindermann, FHG St. Augustin, Germany

Nada Lavrac, Jozef Stefan Institute, Slovenia

Bernardo Magnini, ITC-IRST, Trento, Italy

Dunja Mladenic, Jozef Stefan Institute, Ljubljana, Slovenia

Reza Nakhaeizadeh, DaimlerChrysler, Ulm, Germany

Claire Nedellec, MIG-INRA, Jouy-en-Josas, France

Gerhard Paaß, Fraunhofer AIS, St. Augustin, Germany

Georgios Paliouras, NCSR “Demokritos”, Athens, Greece

Jan Rauch, University of Economics, Prague, Czech Republic

Massimo Ruffolo, ICAR-CNR & EXEURA, Italy

Tobias Scheffer, Humboldt-Univ. Berlin, Germany

Michael Sintek, DFKI, Kaiserslautern, Germany

Derek Sleeman, University of Aberdeen, UK

Steffen Staab, Ontoprise & Univ. of Karlsruhe, Germany

Gerd Stumme, Univ. of Karlsruhe, Germany

York Sure, Univ. of Karlsruhe, Germany

Vojtěch Svátek, Univ. of Economics, Prague, Czech Republic

Domenico Talia, University of Calabria, Italy

Brendan Tierney, Dublin Institute of Technology, Ireland

Ljupco Todorovski, Jozef Stefan Institute, Slovenia

Paola Velardi, Univ. of Roma "La Sapienza", Italy


Workshop Attendance and Registration


All workshop participants must register for ECML/PKDD 2004.