Seminario 15 de xuño
Fecha de publicación: 15-may-2012 12:06:42
O próximo día 15 de xuño a rede RELISCO organizará un seminario na aula 2.1a da Facultade de Informática da Universidade da Coruña. Próximamente incluiremos un programa detallado con todos os participantes e os horarios correspondentes:
Xavier Carreras (UPC)
Autores: Xavier Carreras, Michael Collins e Terry Koo
Título: "A TAG formalism for Parsing and Translation"
Syntactic parsing is the fundamental problem of determining the structure of natural language sentences. It is a challenging task, because syntactic structures of natural languages are recursive, and there is a significant degree of ambiguity in determining how different parts of a sentence combine together syntactically. In any computational model for parsing, the choice of grammar formalism is critical to both the representational power of the model and its computational efficiency. In this talk I will describe a variant of a Tree Adjoining Grammar (TAG) that can use a wide variety of rich features and, at the same time, has efficient algorithms. I will present two applications of our TAG. The first is a discriminative parser, a generalization of Conditional Random Fields for structured prediction that extends the framework to syntactic parsing. The second application is machine translation, where we frame the problem as a parsing task. The TAG-based translation system makes direct use of syntactic structures in modeling differences in word order between different languages, and in modeling the grammaticality of translation output. In both applications we show improvements over state-of-the-art systems.
André Martins (Carnegie Mellon University)
Autores: André Martins, Noah Smith, Mário Figueirido, Eric Xing e Pedro Aguiar
Título: "Turbo Parsing and Constrained Inference with AD^3"
In the first part of this talk, I will present AD^3 (Alternating Direction Dual Decomposition), a new decoding algorithm for approximate LP-MAP inference in constrained factor graphs. The LP-MAP approximation consists in ignoring global effects caused by the cycles of the graph, and can be seen as a linear relaxation of the original problem. The proposed algorithm can handle arbitrary first-order logic constraints and is suitable to massive decompositions, unlike previously proposed dual decomposition algorithms. As an intermediate step, it requires solving small quadratic programs, for which I provide closed form solutions or efficient procedures.
In the second part of the talk, I will apply this methodology to dependency syntax with rich-feature models. I will start by formulating dependency parsing as a concise integer linear program, which is relaxed for tractability. A constrained factor graph is then constructed for this problem and the relaxation is shown to be equivalent to LP-MAP inference in such graph. The resulting framework is called "turbo parsing," and includes as particular cases other parsers proposed in the literature. Finally, I will apply AD^3 for solving the relaxation. Experiments in 14 languages yield state-of-art results.
Carlos Gómez Rodríguez (Universidade da Coruña)
Autores: Carlos Gómez Rodríguez e Daniel Fernández-González.
Título: "Undirected Parsing and Buffer Transitions: Two Approaches to Improve Transition-Based Dependency Parsers"
A dependency parser is a system that can be used to automatically obtain the structure of natural language sentences, as expressed by directed links (dependencies) between words. One of the most widely-used types of dependency parsers are transition-based parsers, which achieve this by using a non-deterministic state machine and a model that scores transitions between its states. In this talk, I will present two different approaches to modify existing transition-based dependency parsers in order to improve their accuracy.
In the first approach, we transform the dependency parsers into variants which build an undirected graph rather than a (directed) dependency structure. The undirected graph is then converted into a directed dependency tree in a post-processing step. This technique
alleviates error propagation, as undirected parsers do not need to observe the single-head constraint.The second approach consists of enriching the parsers with simple transitions that act on buffer nodes. We define two sets of such transitions: projective buffer transitions, which create a left or right links of length one between the first two buffer nodes; and non-projective buffer transitions, which create links involving the second buffer node and the topmost stack node, allowing a limited form of non-projectivity.
Pablo Gamallo (Universidade de Santiago de Compostela)
Título: A Depurative Strategy for Dependency Parsing with Finite State Transducers
We describe a dependency parsing strategy based on finite state transducers, which minimizes the complexity of rules/transducers by using a technique we call /depurative/. Depurative parsing is driven by the "single-head" constraint of Dependency Grammar, and can be seen as an alternative method to the standard /constructive/ strategy. It simplifies the input string by progressively identifying and removing those words that were recognized as /dependents/ by each transducer. At the end of the depurative process, if all the dependencies in the sentence were identified, the input string should contain just one token representing the main head of the sentence. This finite-state strategy was inspired by the /Right/ and /Left Reduce/ operations used in deterministic dependency parsing.