PubMedQA

A Dataset for Biomedical Research Question Answering

About

The task of PubMedQA is to answer biomedical research questions with yes/no/maybe (e.g., "Do preoperative statins reduce atrial fibrillation after coronary artery bypass grafting?") using the corresponding PubMed abstracts.

For more details about PubMedQA, please refer to the paper listed in the Citation section below.

Dataset

PubMedQA contains 1k expert-labeled (PQA-L), 61.2k unlabeled (PQA-U), and 211.3k artificially generated (PQA-A) QA instances.

Please visit our GitHub repository to download the dataset; a minimal loading sketch is shown below.
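
As a rough illustration, the expert-labeled split is distributed as a JSON file mapping PubMed IDs to instances. The sketch below assumes the field names used in the released files (QUESTION, CONTEXTS, final_decision) and a local file name ori_pqal.json; both are assumptions and may differ in other versions of the release.

# Minimal sketch of loading one PQA-L instance, assuming the released JSON
# layout: a dict keyed by PMID, each entry holding the question, the abstract
# sections, and the yes/no/maybe label. Field names are assumptions based on
# the released files.
import json

with open("ori_pqal.json") as f:          # assumed local path to the PQA-L file
    data = json.load(f)

pmid, instance = next(iter(data.items()))
question = instance["QUESTION"]             # the research question
abstract = " ".join(instance["CONTEXTS"])   # structured abstract sections, concatenated
label = instance["final_decision"]          # one of "yes", "no", "maybe"
print(pmid, question, label, sep="\n")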

Submission

To submit your model, please follow the instructions in the GitHub repository.

Citation

If you use PubMedQA in your research, please cite our paper:

@inproceedings{jin2019pubmedqa,
  title={PubMedQA: A Dataset for Biomedical Research Question Answering},
  author={Jin, Qiao and Dhingra, Bhuwan and Liu, Zhengping and Cohen, William and Lu, Xinghua},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
  pages={2567--2577},
  year={2019}
}
Leaderboard

1. Human Performance (single annotator), University of Pittsburgh & Carnegie Mellon University (Jin et al. 2019), Sep 13, 2019
   Accuracy: 78.00%, Macro-F1: 72.19%

2. Baseline Model (single model), University of Pittsburgh & Carnegie Mellon University (Jin et al. 2019), Sep 13, 2019
   Accuracy: 68.08%, Macro-F1: 52.72%
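
For reference, the leaderboard reports three-class accuracy and macro-averaged F1 over the yes/no/maybe labels. The sketch below computes both with scikit-learn; the gold and pred lists are hypothetical placeholders standing in for the test labels and a model's predictions.

# Minimal sketch of the leaderboard metrics: accuracy and macro-averaged F1
# over yes/no/maybe predictions. `gold` and `pred` are placeholder lists;
# in practice they come from the test split and your model's outputs.
from sklearn.metrics import accuracy_score, f1_score

gold = ["yes", "no", "maybe", "yes"]
pred = ["yes", "no", "yes", "yes"]

acc = accuracy_score(gold, pred)
macro_f1 = f1_score(gold, pred, average="macro")
print(f"Accuracy: {acc:.2%}  Macro-F1: {macro_f1:.2%}")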