Omphalos Session: Design and Results of the GI Competition

Omphalos Session

Omphalos Session Programme • Design & Results 25 mins • Award Ceremony 5 mins • Presentation by Alexander Clark 25 mins • Presentation by Georgios Petasis 10 mins • Open Discussion on Omphalos and GI competition 20 mins

Omphalos : Design and Results Brad Starkie François Coste Menno van Zaanen

Contents • Design of the Competition • A complexity measure for GI • Results • Conclusions

Aims • Promote new and better GI algorithms • A forum to compare GI algorithms • Provide an indicative measure of current state-of-the-art.

Design Issues • Format of training data • Method of evaluation • Complexity of tasks

Training Data • Plain Text or Structured Data • Bracketed, Partially bracketed, Labelled, Unlabelled • (+ve and –ve data) or (+ve data only) Plain Text,(+ve and –ve) and (+ve only) • Similar to Abbadingo • Placed fewest restrictions on competitors

Method of Evaluation • Classification of unseen examples • Precision and Recall • Comparison of derivation trees Classification of unseen examples • Similar to Abbadingo • Placed fewest restrictions on competitors

Complexity of the Competition Tasks • Learning task should be sufficiently difficult. • Outside the current state-of-the-art, but not too difficult • Ideally provable that the training sentences are sufficient to identify the target language

Three axes of difficulty • Complexity of the underlying grammar • +ve/-ve or +ve only. • Similarity between -ve and +ve examples.

Complexity Measure of GI • Created a model of the GI based upon a brute force search (Non polynomial) • Complexity measure = size of the hypothesis space created when presented with a characteristic set.

Hypothesis Space for GI • All CFGs can be converted to Chomsky Normal Form. • For any sentence there are a finite number of unlabelled derivations given CNF • Finite number of labelled derivation trees • The grammar can be reconstructed given sufficient number of derivation trees • All possible labelled derivation trees corresponds to all possible CNF grammars given the maximum number of nterms • Solution: calculate max number of nterms and create all possible grammars

BruteForceLearner • Given the positive examples construct all possible grammars • Discard any grammars that generate any negative sentences • Randomly select a grammar from hypothesis set

Characteristic Set of Positive Sentences • Put the grammar into minimal CNF form • If a single rule is removed one or more sentences can't be derived • For each rule add a sentence that can only be derived using that rule • Such a sentence exists if G in minimal form • When presented with this set, one of the hypothesis grammars is correct

Characteristic set of Negative Sentences. • Given G calculate positive sentences • Construct hypothesis set • For each hypothesis H  G, L(H)  L(G) add + sentence s | s L(G) but s L(H) • For each hypothesis H  G, L(H)  L(G) add - sentence s | s L(H) but s L(G) Generating -ve data according to this technique requires exponential time – Therefore cannot be used to generate –ve data in Omphalos.

Creation of the Target Grammars • Benchmark probs identified in literature • Stolcke-93,Nakamura-02,Cook-76,Hopcroft-01 • Number of nterms, terms and rules were selected • Randomly generated grammars, useless rules removed, CF constructs (center recursion) added • A characteristic set of sentences was generated, and complexity measured • To test if deterministic, LR(1) tables created using Bison • For non-deterministic grammar non-deterministic constructs added

Creation of positive data • Characteristic set generated from grammar • Additional training examples added • Size of training set 10  20 size of characteristic set • Longest training example was shorter than the longest test

Creation of negative data • Not guaranteed to be sufficient • Originally randomly created (bad idea) • For probs 6a  10 regular equivalents to grammars constructed and negative data could be generated from regular equivalent to CFG • Nederhof-00 • Center recursion expanded to a finite depth Vs true center recursion • Equal number of positive and negative examples in the test sets

Participation • Omphalos 1st page: ~ 1000 hits from 270 domains • Attempted to discard crawlers and bots hits • All continents except 2 • Data sets : downloaded by 70 different domains • Oracle: 139 label submissions by 8 contestants (4) • Short test sets: 76 submissions • Large test sets: 63 submissions

Results

Techniques Used. • Prob 1 • Solved by hand. • Probs 3, 4, 5, and 6 • Pattern matching using n-grams. • Generated its own negative data • the majority of randomly generated strings would not be contained within the language. • Probs 2, 6.2, 6.4 • Distributional Clustering and ABL

Conclusions • The way in which negative data is created is crucial to judging performance of competitors entries

Review of Aims • Promote development of new and better GI algorithms • Partially achieved • A forum to compare different GI algorithms • Achieved • Provide an indicative measure of the state-of-the-art. • Achieved

Omphalos Session: Design and Results of the GI Competition

Omphalos Session: Design and Results of the GI Competition

Presentation Transcript

Session

Session

SESSION 3 CLOSED SESSION

SESSION

Session:

SESSION

SESSION

Session

Session

Trialogue Session (Session 1.3.1)

Session

Session 17 “Special Session”

Session

Session

Session

Session Title Session Subtitle

Session

SESSION TITLE (Session #)

Session:

Session

Omphalos Session