Scientific writing tips

An eclectic and subjective list

Wolfgang Huber



1. Present experimental observations and measurements in past tense. Use present tense for presuppositions, interpretations and conclusions. The statements in past tense are the real, hard substance, the novelty of an experimental or observational science paper.

2. Use active voice. Avoid passive voice. Use the active voice form to make it clear what you did, what others did, or what already existed from your previously reported work.

3. Use one term per concept, and one concept per term. Avoid synonyms and lexical variation. These only add unnecessary mental load on the reader (‘Do they really mean the same? Or do they mean to leave room for some subtle distinction?’). Similarly, do not overload the same word with multiple meanings. An example is using the same word for measured data, and for what the data are intended to measure or represent. E.g., in RNA-seq, you may have collected for each gene the number of matching sequencing reads, and want it to mean the mRNA abundance, or even the encoded protein’s activity. They are (hopefully) related. But they are not the same. It is part of the scientific approach (and in fact logic and common sense) to care about such distinctions. Good scientists also know when it’s fine to gloss over, so they can move on with the narrative rather than burdening everyone with pedantry.

4. Define each term. Stick with the conventions of your field. Use textbooks, reviews, or at least Wikipedia. If not obvious, provide the reference that contains the standard definition. Have a very good reason if you make up your own definitions. In either case, try to keep your paper self-contained and provide the definition in your text. If you want to use fancy sounding words and technical terms, make sure you really understand their meaning, and that it is what you want. Dictionaries, Google and chatbots are your friends.

5. In the Results section, focus on what you did. Musings about what you or someone could have done instead, or why you did not, are (if at all) for Intro or Discussion.

6. The Results section should proceed step by step, in the right order, without gaps. E.g., in an experimental paper, from sample acquisition, to choice of measurement technology, to specific protocol used, to specifications of the data generated. Then, state the biological questions you asked, how you translated each into a statistical procedure, a visualization and a numerical summary, and how you interpret the result. Always proceed step by step, such that each intermediate step or result is explained before it is used by another step. Use short, simple sentences, and plain language (‘Hemingway style’).

7. In Intro and Discussion, find the right pitch for your story. Research happens in the grey zone between the well-known and the totally unknown. Make sure you describe your situation, i.e., what has been known, and what used to be unknown, or controversial, but now is not, thanks to you. Make sure you find the right framing. It is rarely necessary to paint the scientific background starting with Darwin in the 1860s or the invention of digital computers. But it is helpful to give a wide enough background for a target audience that has not been completely immersed in the topic for the last three years.

8. Be explicit and quantitative. Don’t say ‘few’ or ‘many’, but state the numbers and/or fractions. Avoid most adjectives (e.g., high, good, efficient) and adverbs (e.g., fast). Instead, state the relevant physical quantities (e.g., speed, length, time interval) or numerical metrics (e.g., percent overlap, confusion table). Adverbs and adjectives are rarely needed in Results and Methods sections of papers.

9. Avoid dangling participles, even if the intended meaning is clear. E.g., ‘Using RNA-seq, the genes were quantified’ grammatically says that the genes used RNA-seq. You may think you sound sophisticated, but it undermines you. Similarly for dangling pronouns.

10. Spell out abbreviations. You may think that you make the text more readable by shortening it. The opposite is the case. Many readers do not read the paper linearly from beginning to end, and instead jump around or start in the middle. Even if you think a certain abbreviation is obvious, you will be surprised by how many readers you lose with it.

11. Make each point once, and properly, and then move on. Avoid being repetitive. For instance, state the objective of the work, or of a particular portion of it, clearly and explicitly at the outset, and then just get on with describing the steps, and results. Do not keep repeating the motives, or a previous result.

12. Make sure there is logical progression. Avoid appearing circular. E.g., you can start a paragraph by stating a conclusion, and then provide your logical, fact-based argument for it, based on your premises and data. Or, start with the premises, and then go through the logical steps of data analysis to arrive at a conclusion. But do not put the conclusion, or the premises, both at the start and end of a paragraph. The reader will likely end up being puzzled about what was actually achieved.

13. If a previous paper in the field (even, and in particular, a ‘high impact’ one) used sloppy terminology or sloppy thinking, don’t just uncritically regurgitate it. Improve, or at least take a stand. In science, we’re not paid to be sheep following the herd, but to be better than others.

14. If your paper involves data that you have generated: start early with uploading the data to the appropriate public repository, e.g., EBI. Good journals, in particular, have strict data reporting requirements. Bringing together your data (and metadata) to meet the minimal reporting standards of the public repository can take days or weeks. After data upload, several rounds of back-and-forth with the curators or operators of the database sometimes ensue until the data are really accessible. You do not want to spend the time for this while a journal editor or reviewer is already impatiently waiting.

15. Putting terms in quotes tends to look ‘unprofessional’ and can in most cases be avoided by being careful about the choice of words.

16. If multiple spellings are possible for a term (e.g., hyphenation, capitalization), pick one and stick with it.

17. In biomedical studies, use the word ‘patients’ if you mean actual people, i.e., human beings with life histories, families, feelings, undergoing medical treatment. Use another word if you are just referring to blood samples or tissue biopsies.

18. Keep it simple and laconic. Buzzwords and overuse of technical jargon sound pompous and undermine you.

19. You don’t need to dump on other people’s work in order to make your work look good. It’s not a zero-sum game. You can talk respectfully about prior attempts at your question, and still claim a major advance.

20. Keep it lean. Periodically, go over the text and ask for every single word whether it is necessary. Does it carry new and relevant information? If not, remove it. E.g., prune phrases of the form ‘To achieve X, we did (different word for X)’. Don’t say ‘X represents Y’ when it’s just ‘X is Y’. Doing so exposes you as careless in your logical thinking.

21. Use a spell-checker.

22. Use a tool that allows collaborative editing in a team and keeps version history. There are several options. I like Quarto manuscripts for quantitative work that involves data visualizations and/or maths, and Google Docs otherwise.

23. Graphics: colors are a great way of encoding a categorical or continuous variable in your data. Choose the color map carefully. There is good advice on this on the internet. If your paper has multiple plots showing the same variable, make sure they all use the same color map. Similarly, use distinct color maps for different variables.

24. Graphics: pay attention to font sizes. Make sure all text in figures is large enough to be legible. Avoid clutter and repetition. Make choice of font, font style and sizes consistent across figure panels and figures. In practice, this is surprisingly hard.

25. Graphics: for plots in which the \(x\)- and \(y\)-axes show the same physical or conceptual units, make sure that one unit has the same length on both axes, i.e., usually an aspect ratio of 1:1.
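
Some of the checks in this list can be partly automated. As an illustration of item 16, here is a minimal Python sketch that flags terms spelled in more than one way. The function name `find_spelling_variants`, the sample text `draft`, and the matching rule (two spellings count as variants if they are identical after removing hyphens and ignoring case) are assumptions made for this example, not an established tool. Variants that differ by a space (‘data set’ versus ‘dataset’) are not detected.

```python
import re
from collections import defaultdict


def find_spelling_variants(text):
    """Return groups of spellings that differ only in hyphenation or capitalization."""
    # Words, optionally joined by hyphens, e.g. "RNA-seq" or "dataset".
    tokens = re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text)
    groups = defaultdict(set)
    for token in tokens:
        # Treat an ordinary capitalized word ("Data") like its lowercase form,
        # so that sentence-initial capitals are not reported as variants.
        if token[1:] == token[1:].lower():
            token = token.lower()
        # Spellings that collapse to the same key are candidate variants.
        key = token.lower().replace("-", "")
        groups[key].add(token)
    # Keep only the keys for which more than one spelling was seen.
    return {key: sorted(forms) for key, forms in groups.items() if len(forms) > 1}


# Hypothetical manuscript snippet, made up for this example.
draft = (
    "We generated RNA-seq data. The RNA-Seq libraries were sequenced twice. "
    "Data from each RNAseq run were stored as one data-set; "
    "the dataset identifiers are listed in the table."
)

for key, forms in sorted(find_spelling_variants(draft).items()):
    print(key, "->", ", ".join(forms))
```

For this snippet, the script reports two groups: the three spellings of RNA-seq, and ‘data-set’ versus ‘dataset’. Deciding which spelling to keep is still up to you.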