Accurately citing data, code and preprints is essential for research transparency, reproducibility and academic integrity. These sources differ from traditional books and journal articles, so treating them correctly strengthens your methodology, avoids plagiarism, and meets examiner and institutional expectations.
Why special rules matter
- Data can be updated, versioned or removed; you must identify the exact dataset and version used.
- Code and software often evolve; citing code with a persistent identifier enables reproducibility.
- Preprints are not peer-reviewed; noting their status protects you and your reader.
Proper citation:
- Gives credit to creators (ethical referencing).
- Enables others to reproduce analyses.
- Satisfies institutional checks and examiners’ expectations (see Institutional Policies and Academic Integrity Checks for Dissertations, Essays and Assignments: What Examiners Look For).
Core elements to include (for every non-traditional source)
Always try to include:
- Creator/author(s) (individual(s) or organisation)
- Title (dataset name, code repo name or preprint title)
- Year of publication or release
- Version (if applicable)
- Repository or host (e.g., Zenodo, GitHub, Figshare, arXiv, bioRxiv)
- Persistent identifier (DOI, Handle, accession number) — if none, include a stable URL and access date
- Licence (especially for code)
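One way to keep track of the elements listed above is a small record per source, plus a check that flags anything still missing. The sketch below is only illustrative: `SourceRecord`, `missing_elements` and the field names are hypothetical, and it assumes that version is optional while a stable URL and access date are required only when there is no persistent identifier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SourceRecord:
    """Hypothetical record of the core elements of a non-traditional source."""
    creators: str
    title: str
    year: Optional[int] = None
    version: Optional[str] = None      # optional: only if applicable
    repository: Optional[str] = None   # e.g. Zenodo, GitHub, Figshare, arXiv
    identifier: Optional[str] = None   # DOI, Handle or accession number
    url: Optional[str] = None          # stable URL, needed if no identifier
    accessed: Optional[str] = None     # access date, needed if no identifier
    licence: Optional[str] = None


def missing_elements(record: SourceRecord) -> List[str]:
    """Return the names of core elements that still need to be filled in."""
    required = ["creators", "title", "year", "repository", "licence"]
    missing = [name for name in required if not getattr(record, name)]
    # With no persistent identifier, fall back to a stable URL plus access date.
    if not record.identifier:
        missing += [name for name in ("url", "accessed")
                    if not getattr(record, name)]
    return missing
```

Running `missing_elements` on each source before submission gives a quick list of gaps to fill in your notes or reference manager.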
For guidance on citation styles and formatting, see Mastering Citation Styles for Dissertations, Essays and Assignments: APA, MLA, Chicago and More.
How to cite datasets
Best practice: cite the dataset as you would a publication — include DOI and version.
Example APA-like structure:
- Creator(s). (Year). Title (Version) [Data set]. Repository. DOI
When no DOI exists:
- Provide repository, a stable URL, and the date you accessed the data.
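The APA-like structure and the no-DOI fallback above can be sketched as a small formatter. This is a minimal illustration, not a full APA implementation: `doi_to_url` and `dataset_reference` are hypothetical helper names, and the "(accessed …)" wording simply mirrors the style used in this guide.

```python
from typing import Optional


def doi_to_url(doi: str) -> str:
    """Normalise '10.x/y', 'doi:10.x/y' or a doi.org link to one URL form."""
    doi = doi.strip()
    prefixes = ("https://doi.org/", "http://doi.org/",
                "https://dx.doi.org/", "http://dx.doi.org/", "doi:")
    for prefix in prefixes:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    return "https://doi.org/" + doi


def dataset_reference(creators: str, year: int, title: str, repository: str,
                      version: Optional[str] = None,
                      doi: Optional[str] = None,
                      url: Optional[str] = None,
                      accessed: Optional[str] = None) -> str:
    """Build: Creator(s). (Year). Title (Version) [Data set]. Repository. DOI"""
    version_part = f" ({version})" if version else ""
    if doi:
        locator = doi_to_url(doi)
    elif url and accessed:
        # No DOI: give a stable URL and the date the data were accessed.
        locator = f"{url} (accessed {accessed})"
    else:
        raise ValueError("Provide a DOI, or a stable URL with an access date")
    return (f"{creators} ({year}). {title}{version_part} [Data set]. "
            f"{repository}. {locator}")
```

Raising an error when neither a DOI nor a URL with an access date is supplied makes an incomplete dataset citation hard to overlook.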
Key notes:
- Mention any dataset processing (filters, merges) in your methods.
- If using sequence or genomic data, include accession numbers.
See also: Using DOI, ISBN and Persistent Identifiers Correctly in Dissertations, Essays and Assignments.
How to cite code and software
Software and code can be cited as software, as a dataset (if archived), or as a repository entry. Where possible, cite a DOI: archive a release in a repository such as Zenodo, which can mint a DOI for each GitHub release.
Include:
- Author/organisation
- Year
- Title
- Version/release number
- Repository or archive and DOI or URL
- Licence
If only a GitHub repo is available:
- Cite the author, repository name, year (of the last commit), URL and access date.
- Mention the commit hash in methods for precision.
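To make the commit-hash advice concrete, the sketch below builds a link that points at one exact commit instead of the moving default branch. It assumes you record the full 40-character hash (as printed by `git rev-parse HEAD`); `pinned_github_url` is a hypothetical helper name.

```python
import re

# Assumption: a full 40-character hexadecimal SHA-1 commit hash.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def pinned_github_url(owner: str, repo: str, commit: str) -> str:
    """Return a GitHub URL that shows the repository at one exact commit."""
    commit = commit.strip().lower()
    if not _FULL_SHA.fullmatch(commit):
        raise ValueError("expected the full 40-character commit hash")
    return f"https://github.com/{owner}/{repo}/tree/{commit}"
```

A pinned link still depends on the repository staying online, so an archived release with a DOI remains the better option when one exists.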
See tool recommendations: Reference Management for Dissertations, Essays and Assignments: Zotero, EndNote and Mendeley Compared.
How to cite preprints
Preprints should be clearly labelled as such in citations.
Include:
- Author(s)
- Year
- Title
- “Preprint” label and preprint server (e.g., bioRxiv, arXiv)
- DOI or preprint identifier (arXiv ID)
- If available, link to the peer-reviewed version and note differences
Because preprints are not peer-reviewed, explicitly state their status when discussing conclusions. Consult examiner expectations in Institutional Policies and Academic Integrity Checks….
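The preprint elements above can be assembled in the same way. The sketch below is illustrative only: `arxiv_url` and `preprint_reference` are hypothetical helpers, the pattern covers new-style arXiv identifiers (`YYMM.NNNN` or `YYMM.NNNNN`, optionally with a `vN` version suffix), and the bracketed "[Preprint]" label follows the convention used in this guide.

```python
import re
from typing import Optional

# New-style arXiv identifiers, optionally followed by a version such as v2.
_ARXIV_ID = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")


def arxiv_url(arxiv_id: str) -> str:
    """Turn an arXiv identifier into its abstract-page URL."""
    arxiv_id = arxiv_id.strip()
    if not _ARXIV_ID.fullmatch(arxiv_id):
        raise ValueError("not a new-style arXiv identifier")
    return "https://arxiv.org/abs/" + arxiv_id


def preprint_reference(authors: str, year: int, title: str, server: str,
                       url: str, published_doi: Optional[str] = None) -> str:
    """Build a reference that labels the source clearly as a preprint."""
    ref = f"{authors} ({year}). {title} [Preprint]. {server}. {url}"
    if published_doi:
        # Point readers to the peer-reviewed version when one exists.
        ref += f" Peer-reviewed version: https://doi.org/{published_doi}"
    return ref
```

Keeping the version suffix in the identifier records which revision of the preprint you actually read.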
Quick citation examples (APA 7 style — illustrative)
| Resource type | Reference entry (APA-style example) | In-text |
|---|---|---|
| Dataset (with DOI) | Smith, J., & Ali, R. (2021). South African rainfall dataset (Version 2) [Data set]. Figshare. https://doi.org/10.xxxx/figshare.xxxx | (Smith & Ali, 2021) |
| Software (archived with DOI) | Lee, M. (2022). ClimateTools (Version 1.0) [Computer software]. Zenodo. https://doi.org/10.xxxx/zenodo.xxxx | (Lee, 2022) |
| Code repo (GitHub, no DOI) | Patel, A. (2023). survey-analysis (commit 7f8e2a) [Code]. GitHub. https://github.com/patel/survey-analysis (accessed 2024-03-12) | (Patel, 2023) |
| Preprint (arXiv) | Nkosi, T., & Moyo, L. (2024). Machine learning model for soil moisture prediction [Preprint]. arXiv. https://arxiv.org/abs/2401.xxxx | (Nkosi & Moyo, 2024) |
In-text citation strategies and integrity
- For complex sources and composite citations, follow guidance in In-Text Citation Strategies for Complex Sources in Dissertations, Essays and Assignments.
- When paraphrasing or quoting code comments or dataset documentation, apply the same rules as for text — cite the source and avoid copying extensive non-code text verbatim (see How to Avoid Plagiarism in Dissertations, Essays and Assignments: Paraphrasing, Quoting and Attribution Rules).
- For reused figures/tables from datasets or preprints, get permission if required and clearly attribute the source in the caption.
Best-practice checklist before submission
- Record the exact dataset version, code commit (or DOI) and preprint identifier in your notes.
- Archive your code releases (for example, through Zenodo's GitHub integration) to obtain DOIs where possible.
- State licences for code and data, and check reuse permissions.
- Label preprints and discuss limitations.
- Use a reference manager to track non-traditional sources — see Reference Management….
- Run a final audit using the Reference Audit Checklist: Ensure Complete and Accurate Citations in Dissertations, Essays and Assignments.
- Cross-check bibliography formatting against Creating Perfect Reference Lists and Bibliographies for Dissertations, Essays and Assignments: Common Mistakes to Fix.
Troubleshooting common issues
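- No DOI for data/code: if the licence allows it, archive a copy in a trusted repository to obtain one; otherwise provide a stable URL and access date.
- Multiple authors or institutional authors: list individual authors as your citation style requires, and use the organisation name as author if no individuals are listed.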
- Secondary use of data or classic texts within datasets: follow guidance in Dealing with Secondary Sources and Classic Texts in Dissertations, Essays and Assignments: Ethical Referencing.
Tools & further reading
- Use reference managers to store dataset DOIs, software releases and preprint records. See Reference Management for Dissertations, Essays and Assignments: Zotero, EndNote and Mendeley Compared.
- For examiners’ expectations, read Institutional Policies and Academic Integrity Checks….
- For final formatting, consult Mastering Citation Styles….