Package Guidelines


The Bioconductor project promotes high-quality, well-documented, and interoperable software. These guidelines help to achieve this objective; they are not meant to put an undue burden on package authors, and authors having difficulty satisfying the guidelines should seek advice on the bioc-devel mailing list.

Package maintainers are urged to follow these guidelines as closely as possible when developing Bioconductor packages.

General instructions for producing packages can be found in the Writing R Extensions manual, available from within R (RShowDoc("R-exts")) or on the R web site.


Types of Packages

Most packages contributed by users are software packages that perform analytic calculations. Users also contribute annotation and experiment data packages.

Annotation packages are database-like packages that provide information linking identifiers (e.g., Entrez gene names or Affymetrix probe ids) to other information (e.g., chromosomal location, Gene Ontology category). Authors are also encouraged to use AnnotationHub for storage of and access to large raw data files and their conversion to standard R formats. Instructions for adding data to AnnotationHub and designing an annotation package to use AnnotationHub can be found in Creating AnnotationHub Packages.

Experiment data packages provide data sets that are used, often by software packages, to illustrate particular analyses. These packages contain curated data from an experiment, teaching course or publication and in most cases contain a single data set. Authors are also encouraged to use ExperimentHub for storage of and access to larger data files. ExperimentHub is also particularly useful for hosting collections of related data sets. Instructions for adding data to ExperimentHub and designing an experiment data package to use ExperimentHub can be found in Creating ExperimentHub Packages.

An excellent practice is to develop a software package, and to provide or use an existing experiment data package or data in ExperimentHub to give a comprehensive illustration of the methods in the software package. If the data files of a package are larger than 100 MB but less than 2 GB, Bioconductor now supports the use of Git Large File Storage (Git LFS) during package contribution. Please be aware that Git LFS is free for all users up to 1 GB of data and a monthly usage of 1 GB of bandwidth; more data and bandwidth can be purchased at the contributor's expense. For larger files, it may be worthwhile to explore using the Hubs.

The guidelines below apply to all packages, but annotation and experiment data packages are not required to conform to the space limitations of software packages. Developers wishing to contribute annotation or experiment data packages should seek additional support associated with package submission.


Version of Bioconductor and R

Package developers should always use the devel version of Bioconductor when developing and testing packages to be contributed.

Depending on the R release cycle, using Bioconductor devel may or may not also involve using the devel version of R. See the how-to on using the devel version of Bioconductor for up-to-date information.


Correctness, Space and Time

Bioconductor packages must pass R CMD build (or R CMD INSTALL --build) and pass R CMD check with no errors and no warnings using a recent R-devel. Authors should also try to address all notes that arise during build or check.

Packages must also pass R CMD BiocCheck with no errors and no warnings. The BiocCheck package is a set of tests that encompass Bioconductor Best Practices. Every effort should be made to address any notes that arise during this build or check.

Do not use filenames that differ only in case, as not all file systems are case sensitive.
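A quick way to look for such collisions before submission is sketched below. The helper name case_collisions() is hypothetical and not part of any Bioconductor tool; it lower-cases a vector of paths and reports those that coincide.

```r
## Hypothetical helper (not a Bioconductor function): report paths that
## differ only in case from another path in the same vector.
case_collisions <- function(paths) {
    lower <- tolower(paths)
    ## lower-cased names that occur more than once
    clash <- unique(lower[duplicated(lower)])
    paths[lower %in% clash]
}

## Illustrative file names; inside a package directory one might pass
## list.files(recursive = TRUE) instead
case_collisions(c("R/utils.R", "R/Utils.R", "R/plot.R"))
```

An empty result means no two paths in the vector differ only in case.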

The source package resulting from running R CMD build should occupy less than 4MB on disk. The package should require less than 5 minutes to run R CMD check --no-build-vignettes. Using the --no-build-vignettes option ensures that the vignette is built only once.

Vignette and man page examples should not use more than 3GB of memory since R cannot allocate more than this on 32-bit Windows.

For software packages, individual files must be <= 5MB. This restriction exists even after the package is accepted and added to the Bioconductor repository.

These requirements are the minimum for package acceptance; packages are still subject to the other guidelines below and to a formal technical review by a Bioconductor team member.


Package Name

Choose a descriptive name. An easy way to check whether your name is already in use is to check that the following command fails

## try http:// if https:// URLs are not supported

Avoid names that are easily confused with existing package names, or that imply a temporal (e.g., ExistingPackage2) or qualitative (e.g., ExistingPackagePlus) relationship.



License

The “License:” field in the DESCRIPTION file should preferably refer to a standard license (see Wikipedia) using one of R’s standard specifications. Be specific about any version that applies (e.g., GPL-2). Core Bioconductor packages are typically licensed under Artistic-2.0. To specify a non-standard license, include a file named LICENSE in your package (containing the full terms of your license) and use the string “file LICENSE” (without the double quotes) in the “License:” field of your DESCRIPTION file.


Package Content

Packages must


Package Dependencies

Packages you depend on must be available via Bioconductor or CRAN; users and the automated build system have no way to install packages from other sources.

Reuse, rather than re-implement or duplicate, well-tested functionality from other packages. Specify package dependencies in the DESCRIPTION file, listed as follows

Rarely, a package may offer optional functionality, e.g., visualization with rgl when that package is available. Authors then list the package in the Suggests field, and use requireNamespace() (or loadNamespace()) to condition code execution. Functions from the loaded namespace should be accessed using :: notation, e.g.,

x <- sort(rnorm(1000))
y <- rnorm(1000)
z <- rnorm(1000) + atan2(x, y)
if (requireNamespace("rgl", quietly = TRUE)) {
    ## call rgl through ::, without attaching it to the search path
    rgl::plot3d(x, y, z, col = rainbow(1000))
} else {
    ## code when "rgl" is not available
}

This approach does not alter the user search() path, and ensures that the necessary function (plot3d(), from the rgl package) is used. Such conditional code increases complexity of the package and frustrates users who do not understand why behavior differs between installations, so is often best avoided.


S4 Classes and Methods

Re-use existing functionality, especially for S4 input methods and S4 classes. This encourages interoperability and simplifies your own package development.

If your data requires a new representation or function, carefully design an S4 class or generic so that other package developers with similar needs will be able to re-use your hard work, and so that users of related packages will be able to seamlessly use your data structures. Do not hesitate to ask on the Bioc-devel mailing list for advice. Be sure to implement the essential S4 interface.

Implement a constructor (typically a simple function) if the user is supposed to be able to create an instance of your class. Write short accessors (functions or methods) if the user needs to extract from or assign to slots in the class. Constructors and accessors help separate the interface seen by the user from the implementation details relevant to the developer.
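As a minimal sketch of this pattern, the example below defines a class with a constructor function and a pair of accessors. The class name ScoreSet and the accessor name scores() are hypothetical and chosen only for illustration; a real package would first look for an existing class or generic to re-use.

```r
library(methods)

## Hypothetical class for illustration: a vector of numeric scores
setClass("ScoreSet", slots = c(scores = "numeric"))

## Constructor: a plain function that coerces its input and calls new()
ScoreSet <- function(scores = numeric()) {
    new("ScoreSet", scores = as.numeric(scores))
}

## Accessor (getter) as a generic plus method
setGeneric("scores", function(x) standardGeneric("scores"))
setMethod("scores", "ScoreSet", function(x) x@scores)

## Replacement accessor (setter); re-validates the object after assignment
setGeneric("scores<-", function(x, value) standardGeneric("scores<-"))
setReplaceMethod("scores", "ScoreSet", function(x, value) {
    x@scores <- as.numeric(value)
    validObject(x)
    x
})

## Users never touch the slot directly
s <- ScoreSet(c(1, 2, 3))
scores(s) <- scores(s) * 2
scores(s)
```

Because user code goes through ScoreSet() and scores() only, the slot layout could later change without breaking that code.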

The following layout is sometimes used to organize classes and methods; other approaches are possible and acceptable.

A Collate: field in the DESCRIPTION file may be necessary to order class and method definitions appropriately during package installation.


Robust and Efficient Code

Many R operations are performed on the whole object, not just the elements of the object (e.g., sum(x), not x[1] + x[2] + …). In particular, relatively few situations require an explicit for loop. See the Vectorize section of Robust and Efficient Code for additional detail. See also Coding Style for advice on common coding syntax.
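The contrast can be illustrated with a short sketch; the vector x and the variable names are arbitrary examples.

```r
x <- c(1.5, 2.5, 3.0)

## Element-by-element loop: works, but verbose and slow for long vectors
total <- 0
for (xi in x) {
    total <- total + xi
}

## Whole-object operation: same result in a single call
total_vec <- sum(x)

## Vectorized arithmetic replaces a loop over indices
squared <- x^2
```

Both total and total_vec hold the same sum; the vectorized forms are shorter and leave the iteration to R's internal code.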


Querying Web Resources

Packages that rely on access to web resources need to be written carefully. Web resources can change location, can be temporarily unavailable, or can be very slow to access and retrieve. Functions that query web resources should anticipate and handle such situations gracefully, failing quickly and clearly when the resource is not available in a reasonable time frame. See Querying Web Resources for additional detail and examples of robust web-query functions.
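One possible shape for such a function is sketched below. The helper with_retry() is hypothetical, and the flaky() function is a stand-in for a real web query so that the example runs without network access; the number of attempts and the wait time are arbitrary assumptions.

```r
## Hypothetical helper: call `fetch()` (any zero-argument function that
## performs the web query) up to `n_try` times, then fail with a clear message.
with_retry <- function(fetch, n_try = 3L, wait = 0) {
    for (i in seq_len(n_try)) {
        result <- tryCatch(fetch(), error = function(e) e)
        if (!inherits(result, "error"))
            return(result)
        if (i < n_try)
            Sys.sleep(wait)   # pause before the next attempt
    }
    stop("web resource unavailable after ", n_try, " attempts: ",
         conditionMessage(result), call. = FALSE)
}

## Stand-in for a real query: fails twice, then succeeds
attempts <- 0
flaky <- function() {
    attempts <<- attempts + 1
    if (attempts < 3) stop("temporary failure")
    "payload"
}
with_retry(flaky)
```

Keeping the retry logic separate from the query itself makes it possible to test the failure path without a live server.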


Parallel Recommendations

We recommend using BiocParallel which provides a consistent interface to the user and supports the major parallel computing styles: forks and processes on a single computer, ad hoc clusters, batch schedulers and cloud computing. By default, BiocParallel chooses a parallel back-end appropriate for the OS and is supported across Unix, Mac and Windows. Coding requirements for BiocParallel are:

For more information see the BiocParallel vignette.
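A minimal sketch of a function written for BiocParallel follows. It assumes BiocParallel is installed; the function name square_all() is hypothetical, and exposing a BPPARAM argument is shown as one common convention rather than as a statement of the coding requirements.

```r
## Hypothetical function: square each element, letting the caller choose
## the parallel back-end through a BPPARAM argument.
square_all <- function(x, BPPARAM = BiocParallel::SerialParam()) {
    res <- BiocParallel::bplapply(x, function(i) i^2, BPPARAM = BPPARAM)
    unlist(res)
}
```

A call such as square_all(1:4) runs serially with the default shown here; passing, for example, BPPARAM = BiocParallel::MulticoreParam(2) would request two forked workers on platforms that support forking.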


End-User Messages


Graphics Device

Start a graphics device, if necessary, with a platform-independent function such as dev.new(). Avoid using x11() or X11(), because they can only be called on machines that have access to an X server.



Vignettes

A vignette demonstrates how to accomplish non-trivial tasks embodying the core functionality of your package. There are two common types of vignettes. A Sweave vignette is an .Rnw file that contains LaTeX and chunks of R code. Each R code chunk starts with a line <<>>=, and ends with @. Each chunk is evaluated during R CMD build, prior to LaTeX compilation to a PDF document. An R markdown vignette is similar to a Sweave vignette, but uses markdown instead of LaTeX for structuring text sections and results in HTML output. The knitr package can process most Sweave and all R markdown vignettes, producing pleasing output. Refer to Writing package vignettes for technical details. See the BiocStyle package for a convenient way to use common macros and a standard style.

A vignette provides reproducibility: the vignette produces the same results as copying the corresponding commands into an R session. It is therefore essential that the vignette embed R code between <<>>= and @; short-cuts (e.g., using a LaTeX verbatim environment, or using the Sweave eval=FALSE flag, or equivalent tricks in markdown) undermine the benefit of vignettes.

All packages are expected to have at least one vignette. Vignettes go in the vignettes directory of the package. Vignettes are often used as stand-alone documents, so best practices are to include an informative title, the primary author of the vignette, the last modified date of the vignette, and a link to the package landing page.



Citations

Appropriate citations must be included in help pages (e.g., in the see also section) and vignettes; this aspect of documentation is no different from any scientific endeavor. The file inst/CITATION can be used to specify how a package is to be cited.

Whether or not a CITATION file is present, an automatically-generated citation will appear on the package landing page on the Bioconductor web site. For optimal formatting of author names (if a CITATION file is not present), specify the package author and maintainer using the Authors@R field as described in Writing R Extensions.
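The sketch below shows what an Authors@R specification and a matching citation entry might look like. All names, e-mail addresses, titles and version strings are placeholders, and the choice of a "Manual" entry type is only an assumption for illustration.

```r
## All names, e-mail addresses and titles below are placeholders.
authors <- c(
    person("Jane", "Doe", email = "jane.doe@example.org",
           role = c("aut", "cre")),
    person("John", "Smith", role = "aut")
)

## In DESCRIPTION, the c(person(...), ...) expression above would follow
## the 'Authors@R:' field name.
## An entry suitable for inst/CITATION can be built with bibentry():
entry <- bibentry(
    bibtype = "Manual",
    title   = "ExamplePackage: a placeholder title",
    author  = authors,
    year    = "2024",
    note    = "R package version 0.99.0"
)
format(entry, style = "text")
```

The role "cre" marks the maintainer and "aut" marks an author, as described in Writing R Extensions.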


Version Numbering

All Bioconductor packages use an x.y.z version scheme. The following rules apply:

When first submitted to Bioconductor, a package usually has version 0.99.0. For more details, see Version Numbering.
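As a small illustration of the x.y.z scheme, base R's package_version() can parse and compare such version strings. The version numbers other than 0.99.0 are arbitrary examples.

```r
## package_version() parses an x.y.z string into comparable components
v <- package_version("0.99.0")
unlist(v)            # the components x, y and z

## Comparisons are made component by component, not alphabetically
v < package_version("0.99.10")
v < package_version("1.0.0")
```

Comparing versions this way avoids the pitfalls of string comparison, where "0.99.10" would sort before "0.99.2".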


C or Fortran code

If the package contains C or Fortran code, it should adhere to the standards and methods described in the System and foreign language interfaces section of the Writing R Extensions manual. In particular:

Third-party code

Use of external libraries whose functionality is redundant with libraries already supported is strongly discouraged. In cases where the external library is complex, the author may need to supply pre-built binary versions for some platforms.

By including third-party code a package maintainer assumes responsibility for maintenance of that code. Part of the maintenance responsibility includes keeping the code up to date as bug fixes and updates are released for the mainline third-party project.

For guidance on including code from some specific third-party sources, see the external code sources section of the C++ Best Practices guide.


Unit Tests

Unit tests are highly recommended. We find them indispensable for both package development and maintenance. Examples and explanations are provided here.


URLs and Videos

Add a “URL:” field in your DESCRIPTION file to direct users to source code repositories, additional help resources, etc; details are provided in “Writing R Extensions”, RShowDoc("R-exts").

You can submit an instructional video along with your package. In the DESCRIPTION file of your package, add a “Video:” line which contains the link to your video. We will then feature your video on our Bioconductor YouTube Channel.


Duplication of Packages in CRAN and Bioconductor

Authors are strongly discouraged from placing their package into both CRAN and Bioconductor. This avoids burdening the author with extra work and confusing the user.


Package Author and Maintainer Responsibilities

Acceptance of packages into Bioconductor brings with it ongoing responsibility for package maintenance. These responsibilities include:

All authors mentioned in the package DESCRIPTION file are entitled to modify package source code. Changes to package authorship require consent of all authors.

