This week marks the 50th issue of The Neural MT Weekly. This is a remarkable milestone as we originally set out for this to be an 8-part series. However, as the pace of research in this area continued without cease, and our readership grew, we simply had to keep going. The research into these posts forms part of our weekly MT reading group in Iconic anyway, so they also represent a good opportunity for our team to stay on top of the latest developments.
To mark this 50th issue, and with the 17th Machine Translation Summit taking place next week on our doorstep, we decided to preview some of the papers from the research track that caught our team's eye in the run-up to the conference. These are just a few of dozens of great papers, so we recommend you also take a look at the full proceedings. Check out our highlights below.
A Call for Prudent Choice of Subword Merge Operations in Neural Machine Translation
Ding et al. (2019) investigate the impact of the subword vocabulary size in low-resource conditions. The study is performed on IWSLT 2016 data (2 to 5 million training words). While vocabulary sizes of about 30,000 are very popular, this paper shows that this is not the optimal choice in low-resource conditions. For Transformer models, the optimal choice lies between 0 and 4,000. For LSTM models, the subword vocabulary size has less impact and no optimal choice stands out. Using joint or separate subword vocabularies has little impact. The authors also study how the subword vocabulary size correlates with various linguistic characteristics of the languages involved. Finally, they show that these findings do not hold in high-resource conditions.
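To make the notion of a merge operation concrete, here is a minimal sketch of how byte-pair-encoding merges are learned from a word list. It is a toy illustration of the general technique, not the setup used in the paper; the function name `learn_bpe`, the `</w>` end-of-word marker and the tie-breaking rule are our own assumptions.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merges from a list of words (toy sketch)."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(word) + ("</w>",) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken by the pair itself
        # (an arbitrary choice made here only to keep the sketch deterministic).
        best = max(pairs, key=lambda pair: (pairs[pair], pair))
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab
```

With `num_merges=0` every word stays segmented into single characters, and each additional merge adds one new subword symbol, which is why the number of merges is the knob that controls the subword vocabulary size.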
When: Poster session, Thursday August 22nd, 16:00-17:30
Character-Aware Decoder for Translation into Morphologically Rich Languages
Renduchintala et al. (2019) propose a character-aware decoder that incorporates lower-level morphological patterns into Neural MT. The character-aware decoder is mainly effective when translating into morphologically rich languages. Character awareness is achieved by augmenting both the softmax and embedding layers with convolutional neural networks that operate on the spelling of the word. In the low-resource setting, they report consistent improvements, with BLEU gains of up to +3.05.
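As a rough illustration of what a convolution over a word's spelling looks like, the sketch below builds a word vector from character embeddings with a width-3 convolution followed by max-over-time pooling. It is a generic character-CNN sketch under our own assumptions, not the authors' architecture; the names `make_params` and `spell_out_embedding`, the dimensions and the random initialisation are all hypothetical.

```python
import random

PAD = "<pad>"  # padding symbol placed around the word (an assumption of this sketch)


def make_params(alphabet, char_dim=4, num_filters=5, width=3, seed=0):
    """Create random character embeddings and convolution filters (untrained)."""
    rng = random.Random(seed)
    char_emb = {
        c: [rng.uniform(-1, 1) for _ in range(char_dim)]
        for c in list(alphabet) + [PAD]
    }
    # Each filter holds `width` weight vectors, one per character position.
    filters = [
        [[rng.uniform(-1, 1) for _ in range(char_dim)] for _ in range(width)]
        for _ in range(num_filters)
    ]
    return char_emb, filters


def spell_out_embedding(word, char_emb, filters):
    """Compose a word vector from its spelling: convolution, then max pooling."""
    width = len(filters[0])
    chars = [PAD] * (width - 1) + list(word) + [PAD] * (width - 1)
    vectors = [char_emb[c] for c in chars]
    features = []
    for filt in filters:
        activations = []
        # Slide the filter over every window of `width` consecutive characters.
        for start in range(len(vectors) - width + 1):
            window = vectors[start:start + width]
            activations.append(
                sum(w * x for wv, xv in zip(filt, window) for w, x in zip(wv, xv))
            )
        # Max-over-time pooling keeps the strongest response of each filter.
        features.append(max(activations))
    return features
```

Because the vector is computed from characters, inflected forms that share a stem or a suffix share windows and therefore part of their representation, which is the intuition behind using such a component when the target language is morphologically rich.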
When: Research Track 4, Friday August 23rd, 10:00-10:30
An Intrinsic Nearest Neighbor Analysis of Neural Machine Translation Architectures
In another interesting paper, Ghader and Monz (2019) propose an intrinsic way of comparing neural machine translation architectures by looking at the nearest neighbours of the encoder hidden states. They compare RNN and Transformer models on the basis of syntactic and semantic similarity, and find that Transformer models are better at capturing semantics while RNN models are better at capturing syntax.
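The basic operation behind such an analysis, retrieving the nearest neighbours of a hidden state, can be sketched as follows. This is a minimal illustration only: the use of cosine similarity, the function names and the toy vectors are our assumptions, and the paper's actual similarity measure and evaluation protocol may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbours(query, states, k=3):
    """Return the labels of the k hidden states most similar to `query`.

    `states` maps a label (for example a word occurrence) to its hidden-state
    vector; in a real analysis these vectors would come from the encoder.
    """
    ranked = sorted(
        states,
        key=lambda label: cosine(query, states[label]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors standing in for encoder hidden states.
toy_states = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "run": [0.0, 1.0]}
neighbours = nearest_neighbours([1.0, 0.0], toy_states, k=2)
```

Once the neighbours are retrieved, one can check how many of them share a semantic or syntactic property with the query word, which gives a way of comparing what different encoders capture.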
When: Research Track 2, Thursday August 22nd, 9:30-10:00
Last but not least, it would be remiss of us not to mention our own paper, "Improving Robustness in Real-World Neural Machine Translation". We won't go into the details again because we covered this previously in Issue #45 of this series. Nevertheless, some of our team will be attending the conference, so come and meet them to talk about this paper or any of the other topics they've covered over the first 49 issues of The Neural MT Weekly!
When: Poster session, Wednesday August 21st, 16:00-17:30
We look forward to seeing you in Dublin at the summit. We will also be hosting an MT Social evening on Thursday August 22nd, so make sure to drop by if you're there: https://www.mtsummit2019.com/social-programme