About
The task of PubMedQA is to answer research questions with yes/no/maybe (e.g.: Do preoperative statins reduce atrial fibrillation after coronary artery bypass grafting?) using the corresponding abstracts.
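As a sketch, a single PubMedQA instance pairs a research question with the sentences of its abstract and a yes/no/maybe label. The field names below are assumptions modeled on the released JSON and are not guaranteed to match it exactly; the abstract text is abbreviated and the label is illustrative only:

```python
# Illustrative PubMedQA-style instance (field names and label are a sketch,
# not guaranteed to match the released dataset files exactly).
instance = {
    "QUESTION": (
        "Do preoperative statins reduce atrial fibrillation "
        "after coronary artery bypass grafting?"
    ),
    # Sentences of the corresponding abstract (abbreviated here).
    "CONTEXTS": [
        "Background sentence of the abstract ...",
        "Methods and results sentences of the abstract ...",
    ],
    # The answer is always one of the three classes.
    "final_decision": "yes",
}

assert instance["final_decision"] in {"yes", "no", "maybe"}
```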
For more details about PubMedQA, please refer to our EMNLP 2019 paper, "PubMedQA: A Dataset for Biomedical Research Question Answering".
Dataset
PubMedQA has 1k expert-labeled, 61.2k unlabeled, and 211.3k artificially generated QA instances.
Please visit our GitHub repository to download the dataset.
Submission
To submit your model, please follow the instructions in the GitHub repository.
Citation
If you use PubMedQA in your research, please cite our paper:
@inproceedings{jin2019pubmedqa,
  title={PubMedQA: A Dataset for Biomedical Research Question Answering},
  author={Jin, Qiao and Dhingra, Bhuwan and Liu, Zhengping and Cohen, William and Lu, Xinghua},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
  pages={2567--2577},
  year={2019}
}
Leaderboard
| # | Date | Model | Code | Accuracy (%) | Macro-F1 (%) |
|---|------|-------|------|--------------|--------------|
| 1 | Sep 13, 2019 | Human Performance (single annotator), University of Pittsburgh & Carnegie Mellon University (Jin et al. 2019) | | 78.00 | 72.19 |
| 2 | Sep 13, 2019 | Baseline Model (single model), University of Pittsburgh & Carnegie Mellon University (Jin et al. 2019) | | 68.08 | 52.72 |
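The leaderboard reports accuracy and macro-F1 over the three answer classes. A minimal plain-Python sketch of how these two metrics can be computed (this is not the official evaluation script; use the one in the GitHub repository for submissions):

```python
def evaluate(gold, pred, labels=("yes", "no", "maybe")):
    """Return (accuracy, macro-F1) for yes/no/maybe predictions.

    Macro-F1 averages the per-class F1 scores, so the rare "maybe"
    class counts as much as "yes" and "no".
    """
    acc = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    f1s = []
    for c in labels:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return acc, sum(f1s) / len(f1s)


# Toy example with four instances:
acc, macro_f1 = evaluate(
    ["yes", "no", "maybe", "yes"],
    ["yes", "no", "yes", "yes"],
)
print(acc, macro_f1)  # 0.75 0.6
```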