
Form layout understanding is the task of extracting and structuring information from scanned documents. It consists primarily of three subtasks: (i) word grouping, (ii) entity labeling, and (iii) entity linking. Although the three subtasks depend on each other, current approaches solve each of them independently. In this work, we propose a multi-task learning approach that learns all three tasks jointly. Since the tasks are related, the idea is to learn a shared embedding that can perform better on all three. Further, the publicly available form understanding datasets are too small to be ideal for training complex deep learning models, and multi-task learning is an effective way to provide some degree of regularization on such small datasets. The proposed model, MTL-FoUn, outperforms existing approaches that learn the individual form understanding tasks separately on the publicly available data.
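To make the idea of a shared embedding concrete, the following is a minimal sketch in plain Python, not the MTL-FoUn architecture itself. It assumes a toy setup in which one encoder feeds three task heads and the training objective is a weighted sum of the three task losses. The class `SharedEncoderModel`, the function `joint_loss`, the feature sizes, the number of entity labels, and the equal default weights are all hypothetical choices made for illustration.

```python
import math
import random


def linear(x, w):
    """Multiply a vector x (length d_in) by a weight matrix w (d_in x d_out)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def init(d_in, d_out, rng):
    """Small random weight matrix of shape d_in x d_out."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(d_out)] for _ in range(d_in)]


def softmax_xent(logits, target):
    """Cross-entropy of a softmax over logits against an integer class index."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]


def binary_xent(logit, target):
    """Numerically stable binary cross-entropy on a single logit (target 0 or 1)."""
    z = logit if target == 1 else -logit
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


class SharedEncoderModel:
    """Toy model: one shared encoder, three task-specific heads (all hypothetical)."""

    def __init__(self, d_in=4, d_shared=8, n_labels=4, seed=0):
        rng = random.Random(seed)
        self.enc = init(d_in, d_shared, rng)              # shared by all tasks
        self.label_head = init(d_shared, n_labels, rng)   # entity labeling
        self.group_head = init(2 * d_shared, 1, rng)      # word grouping (pairwise)
        self.link_head = init(2 * d_shared, 1, rng)       # entity linking (pairwise)

    def embed(self, x):
        """Shared embedding used by every head."""
        return [math.tanh(h) for h in linear(x, self.enc)]

    def losses(self, a, b, label_a, same_group, linked):
        """Per-task losses for one pair of items a and b."""
        ea, eb = self.embed(a), self.embed(b)
        pair = ea + eb  # list concatenation of the two shared embeddings
        return {
            "grouping": binary_xent(linear(pair, self.group_head)[0], same_group),
            "labeling": softmax_xent(linear(ea, self.label_head), label_a),
            "linking": binary_xent(linear(pair, self.link_head)[0], linked),
        }


def joint_loss(losses, weights=None):
    """Weighted sum of task losses; equal weights are an assumption, not the paper's."""
    weights = weights or {name: 1.0 for name in losses}
    return sum(weights[name] * value for name, value in losses.items())


model = SharedEncoderModel()
parts = model.losses([0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1],
                     label_a=2, same_group=1, linked=0)
total = joint_loss(parts)
```

The sketch only computes the forward pass and the combined loss. In practice the gradient of `total` would flow back through every head into the shared encoder, which is how the three tasks would regularize one another.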
