Cite Datasets in MLA 9, APA 7, Chicago & IEEE

Cite Zenodo, Figshare, Dryad & ICPSR datasets in MLA 9, APA 7, Chicago & IEEE — with DataCite DOIs and worked examples for repositories.

CiteMe Editorial Team

Academic Research Team

Why citing datasets matters (and what counts as one)

A research dataset is a structured collection of data produced or assembled as part of scholarship — survey responses, genomic sequences, interview transcripts, measurement tables, image corpora, and more. Citing datasets is now an expectation in most scientific disciplines: funders (NIH, NSF, UKRI, Horizon Europe) require data-management plans; journals (Nature, Science, PLOS) require dataset citations in the reference list; and reproducibility norms demand that the data you analyzed is discoverable by future readers.

Most datasets live in dedicated repositories: Zenodo, Figshare, Dryad, ICPSR, OpenNeuro, Gene Expression Omnibus (GEO), the UK Data Service, or a domain-specific archive. Each issues a persistent identifier, usually a DataCite DOI and sometimes an accession number (e.g., GEO series GSE12345). The DOI is the single most important field in a dataset citation: without a persistent identifier, readers have no reliable way to retrieve the exact data you used.

How to cite a dataset in APA 7th edition

APA 7 treats datasets as standalone publications. Template: Author, A. A., Author, B. B., & Author, C. C. (Year). Title of dataset (Version X.X) [Data set]. Repository. https://doi.org/xxxxx

APA 7 — dataset with DataCite DOI
Last, A. B., & Other, C. D. (2024). Title of the dataset (Version 1.2) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.1234567
  • The "[Data set]" bracket label is required by APA 7 §10.9 to distinguish dataset sources from articles or books
  • Include the version number when the dataset is versioned (Zenodo versions each release; others may not)
  • The repository name (Zenodo, Figshare, ICPSR) replaces "publisher"
  • Cite the specific version you analyzed, not just the "latest" concept DOI, if reproducibility matters
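
The APA template above can be expressed as a small formatter. This is a minimal sketch, not an official APA tool: the function name format_apa_dataset and its parameters are hypothetical, authors are passed as (surname, initials) pairs, italics cannot be shown in plain text, and the special handling APA applies to very long author lists is not implemented.

```python
from typing import Optional, Sequence, Tuple


def format_apa_dataset(
    authors: Sequence[Tuple[str, str]],  # (surname, initials), e.g. ("Last", "AB")
    year: int,
    title: str,
    repository: str,
    doi: str,
    version: Optional[str] = None,
) -> str:
    """Assemble an APA 7 style dataset reference as plain text (hypothetical helper)."""
    # ("Last", "AB") -> "Last, A. B."
    names = [f"{surname}, " + " ".join(f"{c}." for c in initials)
             for surname, initials in authors]
    # Two or more authors: comma-separated, with ", & " before the final name.
    byline = names[0] if len(names) == 1 else ", ".join(names[:-1]) + ", & " + names[-1]
    # The version is optional because not every repository versions its records.
    version_part = f" (Version {version})" if version else ""
    return (f"{byline} ({year}). {title}{version_part} [Data set]. "
            f"{repository}. https://doi.org/{doi}")


print(format_apa_dataset(
    [("Last", "AB"), ("Other", "CD")], 2024, "Title of the dataset",
    "Zenodo", "10.5281/zenodo.1234567", version="1.2",
))
```

With the inputs shown, the printed line reproduces the worked APA example above.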

How to cite a dataset in MLA 9th edition

MLA 9 applies the same container-based template it uses for every other source type. Template: Author Last, First. "Title of Dataset." Repository, Publication Year, DOI or URL.

MLA 9 — dataset
Last, Firstname, and Firstname Other. "Title of the Dataset." Zenodo, 2024, https://doi.org/10.5281/zenodo.1234567.

MLA 9 does not require a bracket label ("[Data set]") — the repository name implicitly signals the source type. If disambiguation would help the reader, add "Dataset" as an optional descriptor after the title.
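
A minimal sketch of the MLA template as code, under stated assumptions: format_mla_dataset and its parameters are hypothetical names, authors are (surname, given name) pairs, and the location string (DOI or URL) is passed through exactly as the caller supplies it. The optional "Dataset" descriptor mentioned above is not modeled.

```python
from typing import Sequence, Tuple


def format_mla_dataset(
    authors: Sequence[Tuple[str, str]],  # (surname, given name)
    title: str,
    repository: str,
    year: int,
    location: str,  # DOI or URL, passed through unchanged
) -> str:
    """Assemble an MLA 9 style dataset entry as plain text (hypothetical helper)."""
    surname, given = authors[0]
    byline = f"{surname}, {given}"  # only the first author's name is inverted
    if len(authors) == 2:
        second_surname, second_given = authors[1]
        byline += f", and {second_given} {second_surname}"
    elif len(authors) > 2:
        byline += ", et al"  # three or more authors are shortened to "et al."
    return f'{byline}. "{title}." {repository}, {year}, {location}.'


print(format_mla_dataset(
    [("Last", "Firstname"), ("Other", "Firstname")],
    "Title of the Dataset", "Zenodo", 2024,
    "https://doi.org/10.5281/zenodo.1234567",
))
```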

How to cite a dataset in Chicago 17th edition

Chicago author-date (§14.257): Last name, First name. Year. "Title of Dataset." Version. Repository. DOI.

Chicago author-date — dataset
Last, Firstname, and Firstname Other. 2024. "Title of the Dataset." Version 1.2. Zenodo. https://doi.org/10.5281/zenodo.1234567.

Chicago notes-bibliography style uses both a footnote and a bibliography entry. The bibliography entry resembles the author-date form above, except that the year moves from after the author names to after the repository name; the footnote flattens everything into a single comma-separated line:

Chicago notes-bibliography — first footnote
Firstname Last and Firstname Other, "Title of the Dataset," Version 1.2, Zenodo, 2024, https://doi.org/10.5281/zenodo.1234567.
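
The two Chicago forms shown above differ mainly in name order, punctuation, and where the year sits, so both can be generated from one record. The sketch below assumes hypothetical helper names (chicago_author_date, chicago_first_footnote) and (surname, given name) author pairs; Chicago's rules for shortening very long author lists are not handled.

```python
from typing import Optional, Sequence, Tuple

Author = Tuple[str, str]  # (surname, given name)


def _join(names: Sequence[str], comma_for_two: bool) -> str:
    """Join names with 'and'; an inverted first name needs a comma even for two authors."""
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return names[0] + (", and " if comma_for_two else " and ") + names[1]
    return ", ".join(names[:-1]) + ", and " + names[-1]


def chicago_author_date(authors: Sequence[Author], year: int, title: str,
                        repository: str, doi: str,
                        version: Optional[str] = None) -> str:
    # Only the first author is inverted; the year follows the names.
    names = [f"{authors[0][0]}, {authors[0][1]}"]
    names += [f"{given} {surname}" for surname, given in authors[1:]]
    version_part = f" Version {version}." if version else ""
    return (f'{_join(names, True)}. {year}. "{title}."{version_part} '
            f"{repository}. https://doi.org/{doi}.")


def chicago_first_footnote(authors: Sequence[Author], year: int, title: str,
                           repository: str, doi: str,
                           version: Optional[str] = None) -> str:
    # Footnotes keep natural name order and separate elements with commas.
    names = [f"{given} {surname}" for surname, given in authors]
    version_part = f" Version {version}," if version else ""
    return (f'{_join(names, False)}, "{title},"{version_part} '
            f"{repository}, {year}, https://doi.org/{doi}.")


record = dict(authors=[("Last", "Firstname"), ("Other", "Firstname")], year=2024,
              title="Title of the Dataset", repository="Zenodo",
              doi="10.5281/zenodo.1234567", version="1.2")
print(chicago_author_date(**record))
print(chicago_first_footnote(**record))
```

Keeping one record and two renderers avoids the two forms drifting apart when a field such as the version changes.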

How to cite a dataset in IEEE

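IEEE uses a numbered reference, optionally marked with "Dataset" or a similar descriptor. Template: [N] A. Last and B. Other, "Title of dataset," Repository, Year. [Online]. Available: URL. When the dataset has a DOI, the entry can instead end with "doi: DOI", as in the example below.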

IEEE — dataset reference entry
[12] A. Last and B. Other, "Title of the dataset," Zenodo, 2024, doi: 10.5281/zenodo.1234567.

IEEE is common in engineering and computer science, where datasets from benchmark collections (ImageNet, COCO, LibriSpeech) are cited constantly. Include the version whenever one exists: benchmark datasets evolve, and results obtained on a later version may differ.
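
A minimal sketch of the IEEE entry in its DOI form, following the worked example above. The name format_ieee_dataset and its parameters are hypothetical. Because the example above does not show where a version goes, the sketch omits it; add it in whatever position your venue's style guide specifies.

```python
from typing import Sequence, Tuple


def format_ieee_dataset(
    number: int,                          # position in the numbered reference list
    authors: Sequence[Tuple[str, str]],   # (surname, initials), e.g. ("Last", "A")
    title: str,
    repository: str,
    year: int,
    doi: str,
) -> str:
    """Assemble an IEEE style dataset reference as plain text (hypothetical helper)."""
    # IEEE puts initials before the surname: ("Last", "AB") -> "A. B. Last"
    names = [" ".join(f"{c}." for c in initials) + f" {surname}"
             for surname, initials in authors]
    if len(names) <= 2:
        byline = " and ".join(names)
    else:
        byline = ", ".join(names[:-1]) + ", and " + names[-1]
    return f'[{number}] {byline}, "{title}," {repository}, {year}, doi: {doi}.'


print(format_ieee_dataset(
    12, [("Last", "A"), ("Other", "B")], "Title of the dataset",
    "Zenodo", 2024, "10.5281/zenodo.1234567",
))
```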

How to cite a dataset in Vancouver / ICMJE

Vancouver — dataset
Last A, Other B. Title of the dataset [Internet]. Zenodo; 2024 [cited 2024 Mar 15]. Version 1.2. Available from: https://doi.org/10.5281/zenodo.1234567

Vancouver / ICMJE guidance for datasets appears in NLM's Citing Medicine; the [Internet] tag and access date [cited…] are standard conventions. Biomedical datasets (GEO, SRA, dbGaP, UK Biobank) usually have their own accession-number conventions — include the accession number alongside the DOI when both exist.
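
The Vancouver example above can be sketched as code as well. The name format_vancouver_dataset and its parameters are hypothetical; the month abbreviations are hard-coded so the access date does not depend on the system locale, and accession numbers are not modeled.

```python
from datetime import date
from typing import Optional, Sequence, Tuple

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_vancouver_dataset(
    authors: Sequence[Tuple[str, str]],  # (surname, initials), e.g. ("Last", "A")
    title: str,
    repository: str,
    year: int,
    cited: date,                          # the day you accessed the dataset
    doi: str,
    version: Optional[str] = None,
) -> str:
    """Assemble a Vancouver style dataset reference as plain text (hypothetical helper)."""
    # Surname followed by unpunctuated initials, authors separated by commas.
    byline = ", ".join(f"{surname} {initials}" for surname, initials in authors)
    cited_text = f"{cited.year} {MONTHS[cited.month - 1]} {cited.day}"
    version_part = f" Version {version}." if version else ""
    return (f"{byline}. {title} [Internet]. {repository}; {year} "
            f"[cited {cited_text}].{version_part} "
            f"Available from: https://doi.org/{doi}")


print(format_vancouver_dataset(
    [("Last", "A"), ("Other", "B")], "Title of the dataset", "Zenodo",
    2024, date(2024, 3, 15), "10.5281/zenodo.1234567", version="1.2",
))
```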

Versions, DataCite, and persistent identifiers

Most research data is versioned. Zenodo assigns each release its own DOI and also issues a "concept DOI" that always resolves to the latest version. When reproducibility matters, cite the version-specific DOI rather than the concept DOI: six months later, the concept DOI may resolve to a different release than the one you analyzed.

  • DataCite DOIs look like 10.5281/zenodo.XXXXXXX (Zenodo), 10.6084/m9.figshare.XXXXX (Figshare), 10.5061/dryad.XXXXX (Dryad) — the prefix identifies the repository
  • Accession numbers (GSE12345, PRJNA67890) are equally valid for some domains — cite them alongside the DOI when the repository uses both
  • If there is truly no DOI or accession number, the dataset is "unpublished data": describe it in the text and say how readers can request it, rather than citing it formally
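
The prefix rule in the list above lends itself to a small lookup. The sketch below is an illustration only: the helper names are hypothetical, the prefix table contains just the three repositories named above, and the accession patterns cover only the two formats mentioned (GSE for GEO series, PRJNA for NCBI BioProject records).

```python
import re
from typing import Optional

# Only the prefixes listed above; extend the table for other repositories.
REPOSITORY_BY_PREFIX = {
    "10.5281": "Zenodo",
    "10.6084": "Figshare",
    "10.5061": "Dryad",
}

ACCESSION_PATTERNS = {
    "GEO series": re.compile(r"^GSE\d+$"),
    "NCBI BioProject": re.compile(r"^PRJNA\d+$"),
}


def normalize_doi(raw: str) -> str:
    """Strip a resolver URL or 'doi:' label and lowercase (DOIs are case-insensitive)."""
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", raw.strip(),
                 flags=re.IGNORECASE)
    return doi.lower()


def repository_for_doi(raw: str) -> Optional[str]:
    """Return the repository implied by the DOI prefix, or None if it is not in the table."""
    prefix = normalize_doi(raw).split("/", 1)[0]
    return REPOSITORY_BY_PREFIX.get(prefix)


def accession_type(identifier: str) -> Optional[str]:
    """Return a label for a recognized accession number, or None."""
    for label, pattern in ACCESSION_PATTERNS.items():
        if pattern.match(identifier.strip().upper()):
            return label
    return None


print(repository_for_doi("https://doi.org/10.5281/zenodo.1234567"))
print(accession_type("GSE12345"))
```

A lookup like this cannot tell a Zenodo concept DOI from a version DOI; that distinction has to be checked on the record's landing page.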

Common mistakes when citing datasets

  • Citing only the paper that describes the data instead of the data itself: both are valid sources, but they need separate reference entries
  • Using the "concept DOI" instead of the version DOI when results depend on a specific data snapshot
  • Missing the "[Data set]" label in APA — APA 7 §10.9 requires this bracket tag
  • Treating a GitHub release as equivalent to a Zenodo DOI — it isn't. Link GitHub repos to Zenodo to get a citable DOI
  • Forgetting to credit all co-creators — dataset author lists are often long; copy from the repository landing page, don't shorten arbitrarily
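
Some of the mistakes above can be caught mechanically before submission. The following is a rough sketch of a checker for APA-style dataset references; the function name and the wording of the messages are hypothetical, the DOI pattern is a common heuristic rather than a full validator, and the check cannot distinguish a concept DOI from a version DOI or detect a shortened author list.

```python
import re
from typing import List

# Heuristic: "10.", a 4-9 digit registrant code, a slash, then a non-space suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")
VERSION_PATTERN = re.compile(r"\(Version [^)]+\)")


def lint_apa_dataset_reference(reference: str) -> List[str]:
    """Return a list of likely problems in an APA 7 dataset reference (hypothetical helper)."""
    problems = []
    if "[Data set]" not in reference:
        problems.append('missing "[Data set]" label')
    if not DOI_PATTERN.search(reference):
        problems.append("no DOI found")
        if "github.com" in reference.lower():
            problems.append("GitHub URL used in place of a DOI")
    if not VERSION_PATTERN.search(reference):
        # Only a warning: unversioned datasets legitimately have no version.
        problems.append("no version given (acceptable only if the dataset is unversioned)")
    return problems


good = ("Last, A. B., & Other, C. D. (2024). Title of the dataset (Version 1.2) "
        "[Data set]. Zenodo. https://doi.org/10.5281/zenodo.1234567")
bad = "Last, A. B. (2024). Title of the dataset. https://github.com/example/repo"

print(lint_apa_dataset_reference(good))
print(lint_apa_dataset_reference(bad))
```

The reference in good is the worked APA example from earlier in this article; the one in bad is an invented string used only to trigger the warnings.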
