Abstract
Inductive Logic Programming (ILP) is a broad field within machine learning. Our research focuses on Information Extraction (IE) using ILP, which requires a large amount of data to train the system. ILP applications typically deal with highly unbalanced data; in our case, negative examples far outnumber positive examples. IE is the process of finding facts in unstructured text, such as biomedical journals, and storing those facts in an organized form. In particular, we focus on learning to recognize instances of the protein-localization relationship in Medline abstracts. We view the problem as a machine-learning task: given positive and negative extractions from a training corpus of abstracts, learn a logical theory. Given a new data set, the system must correctly identify each protein and its location. Because the data are so unbalanced, performance in these domains is commonly measured with precision and recall rather than accuracy alone. Current precision and recall values are low, and our goal is to improve them.
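To illustrate why precision and recall are preferred over accuracy on unbalanced data, here is a minimal sketch in Python. The counts are hypothetical and chosen only for illustration (10 positive and 990 negative examples); they are not results from this work.

```python
def precision_recall_accuracy(tp, fp, fn, tn):
    """Compute precision, recall and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy

# Hypothetical extractor that rejects every candidate:
# it finds none of the 10 positives, yet is right on all 990 negatives.
reject_all = precision_recall_accuracy(tp=0, fp=0, fn=10, tn=990)

# Hypothetical extractor that finds 6 of the 10 positives
# at the cost of 4 false positives.
modest = precision_recall_accuracy(tp=6, fp=4, fn=4, tn=986)

for name, (p, r, a) in [("reject_all", reject_all), ("modest", modest)]:
    print(f"{name}: precision={p:.3f} recall={r:.3f} accuracy={a:.3f}")
```

Both extractors score above 99% accuracy, but only precision and recall show that the first one extracts nothing useful.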
Introduction
ILP is a research area at the intersection of machine learning and logic programming. Induction is reasoning from the specific to the general. In inductive learning from examples, the learner is given a set of examples and derives from them general rules, or a theory, that explain those examples. Inductive learning has been successfully applied to a variety of classification and prediction problems. ILP uses logic programming as a uniform representation for examples, background knowledge and hypotheses. Given an encoding of the known background knowledge and a set of positive and negative examples represented as a logical database of facts, an ILP system derives a hypothesized logic program that, together with the background knowledge, entails all of the positive examples and none of the negative examples.
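The requirement that a hypothesis entail all positive and no negative examples can be sketched as a simple coverage check. This is an illustration in Python, not an ILP system: the predicate names (`protein`, `location`, `same_sentence`), the facts, and the candidate rule are all hypothetical, and entailment is reduced to looking up ground facts.

```python
# Background knowledge as ground facts (all hypothetical).
background = {
    ("protein", "p1"), ("protein", "p2"),
    ("location", "nucleus"), ("location", "membrane"),
    ("same_sentence", "p1", "nucleus"),
    ("same_sentence", "p2", "membrane"),
}

# Examples of the target relation localizes_to(Protein, Location).
positives = [("p1", "nucleus"), ("p2", "membrane")]
negatives = [("p1", "membrane"), ("p2", "nucleus")]

def candidate(p, loc):
    # localizes_to(P, L) :- protein(P), location(L), same_sentence(P, L).
    return (("protein", p) in background
            and ("location", loc) in background
            and ("same_sentence", p, loc) in background)

def too_general(p, loc):
    # localizes_to(P, L) :- protein(P), location(L).
    return ("protein", p) in background and ("location", loc) in background

def is_complete(rule, pos):
    """True if the rule covers every positive example."""
    return all(rule(*example) for example in pos)

def is_consistent(rule, neg):
    """True if the rule covers no negative example."""
    return not any(rule(*example) for example in neg)

for rule in (candidate, too_general):
    print(rule.__name__,
          "complete:", is_complete(rule, positives),
          "consistent:", is_consistent(rule, negatives))
```

The first rule is both complete and consistent on this toy data, while the second covers the negative examples as well and would be rejected. On noisy real-world data, a learner may have to accept rules that only approximately meet these two conditions.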