Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Page Not Found

Page not found. Your pixels are in another canvas.

Personal web page

About me

Archive Layout with Content

Posts by Category

Posts by Collection

CV

Markdown

Page not in menu

This is a page not in the main menu.

Page Archive

Portfolio

Publications

Sitemap

Posts by Tags

Talk map

Talks and presentations

Teaching

Terms and Privacy Policy

Blog posts

Jupyter notebook markdown generator

Posts

Seven recommendations for machine translation evaluation

5 minute read

Published: December 15, 2020

Evaluating machine translation systems is not as obvious as it seems at first glance.

Using old training data or test sets

2 minute read

Published: December 14, 2020

In MT research there is a long-standing tradition of using old data sets even when newer versions are available for the same language pair and domain.

Statistical significance testing

3 minute read

Published: December 14, 2020

In recent blog posts I have described many potential issues with MT evaluation. Surely statistical significance testing should help mitigate some of those problems? That may seem reasonable, but the truth is: it can be laughably easy to arrive at results that are statistically significant according to the most popular test in MT research, bootstrap resampling.

Single training runs and estimates of variance

5 minute read

Published: December 14, 2020

Consider a very simple example of a table reporting BLEU scores:

Simulating low-resource experiments

4 minute read

Published: December 14, 2020

Low-resource machine translation has been an active area of research for years. On a high level, what many papers on low-resource MT have in common is that they simulate low-resource scenarios.

Designing human evaluations and reporting the outcomes

3 minute read

Published: December 14, 2020

Every once in a while, there is an MT paper claiming to have achieved human parity (e.g. Hassan et al., 2018; Popel et al., 2020). To be fair, a message like

Computing and reporting BLEU scores

9 minute read

Published: December 14, 2020

BLEU scores are ubiquitous in MT research and they usually appear in tables that look like this:

Comparing to previous work

1 minute read

Published: December 14, 2020

What is common between many of the questionable practices I blog about is that they are seemingly legitimized by saying

portfolio

Portfolio item number 1

Short description of portfolio item number 1

Portfolio item number 2

Short description of portfolio item number 2

publications

talks

Talk 1 on Relevant Topic in Your Field

Published: March 01, 2012

This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!

Tutorial 1 on Relevant Topic in Your Field

Published: March 01, 2013

More information here

Talk 2 on Relevant Topic in Your Field

Published: February 01, 2014

More information here

Conference Proceeding talk 3 on Relevant Topic in Your Field

Published: March 01, 2014

This is a description of your conference proceedings talk. Note the different field in type. You can put anything in this field.

teaching