HELM Notation


The Hierarchical Editing Language for Macromolecules 

HELM was first conceived at Pfizer in the summer of 2008 to support the Pfizer oligonucleotide therapeutic unit. Molecules were first registered in the Pfizer corporate database using HELM in December 2008.

HELM was designed as a single notation that can encode the structure of all biomolecules.


Covers different types of macromolecule

  • DNA/RNA, Peptide and Chem polymer types are already defined.
  • Other polymer types can be added.

Allows non-natural monomers

  • Monomers are defined by the HELM author, so there is no limit on the monomers that can be included in your molecule.

Is portable

  • xHELM allows you to ‘bundle’ all the monomer information with your molecule definition into a single package that can be used to transfer information outside your organisation.


Adds the ability to define 

  • Unknown sections of polymers, or entire unknown polymers
  • Repeating units
  • Annotations
  • Unknown connection points
  • Multiple possible connection points
  • Probabilities of different monomers or polymers
  • Mixtures

How it works

HELM contains multiple levels of information:

  • Monomers - the atom/bond representation of the building blocks
  • Simple polymers - a linear sequence of monomers of the same type
  • Complex polymers - combinations of simple polymers, hydrogen bond information and annotations

This flexibility allows you to define molecules such as an oligonucleotide connected to a small molecule (SMCC), which is in turn connected to a peptide.
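To make the levels described above more concrete, here is a minimal Python sketch that splits a HELM string into its '$'-separated sections and then reads the simple polymers and connections. It is an illustration under stated assumptions, not the open source toolkit's parser. The function names, section labels and example string are all hypothetical. The example assumes a cysteine monomer whose side-chain connection point is R3. The naive splitting on '.', '|' and '$' does not handle bracketed monomers, in-line notation, repeats, ambiguity or annotations.

```python
import re

# Labels chosen for this sketch only; they are not terms from the specification.
# Assumption: a HELM string has four '$' separators, giving five sections,
# the last being empty (HELM 1) or a version token such as V2.0 (HELM 2).
SECTIONS = ("simple_polymers", "connections", "section3", "annotations", "version")

# A polymer ID such as PEPTIDE1, RNA2 or CHEM1, followed by a {...} body.
POLYMER_RE = re.compile(r"^(?P<type>PEPTIDE|RNA|CHEM)(?P<index>\d+)\{(?P<body>.+)\}$")

# A connection such as PEPTIDE1,PEPTIDE1,1:R3-4:R3
CONNECTION_RE = re.compile(
    r"^(?P<source>\w+),(?P<target>\w+),"
    r"(?P<source_pos>\w+):(?P<source_point>\w+)-"
    r"(?P<target_pos>\w+):(?P<target_point>\w+)$"
)


def split_sections(helm):
    """Split a HELM string into its five '$'-separated sections."""
    parts = helm.split("$")
    if len(parts) != len(SECTIONS):
        raise ValueError(f"expected 4 '$' separators, found {len(parts) - 1}")
    return dict(zip(SECTIONS, parts))


def parse_simple_polymers(section):
    """Return one dict per simple polymer: its ID, type and monomer list."""
    polymers = []
    for chunk in section.split("|"):
        match = POLYMER_RE.match(chunk)
        if match is None:
            raise ValueError(f"unrecognised simple polymer: {chunk!r}")
        polymers.append({
            "id": match["type"] + match["index"],
            "type": match["type"],
            # Naive split: only safe for plain monomer IDs without '.' inside.
            "monomers": match["body"].split("."),
        })
    return polymers


def parse_connections(section):
    """Return one dict per connection; an empty section means no connections."""
    if not section:
        return []
    connections = []
    for chunk in section.split("|"):
        match = CONNECTION_RE.match(chunk)
        if match is None:
            raise ValueError(f"unrecognised connection: {chunk!r}")
        connections.append(match.groupdict())
    return connections


# Illustrative string: a four-residue peptide whose two cysteines are linked.
helm = "PEPTIDE1{C.A.G.C}$PEPTIDE1,PEPTIDE1,1:R3-4:R3$$$"
sections = split_sections(helm)
polymers = parse_simple_polymers(sections["simple_polymers"])
connections = parse_connections(sections["connections"])
print(polymers)
print(connections)
```

The sketch keeps the three levels separate: monomer IDs are only names that would be looked up in a monomer set, simple polymers are lists of those names, and the complex polymer is the combination of simple polymers with the connections between them.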


The HELM 2.04 specification is available below.

There are minor changes in the 2.04 update. Specifically:

  • Added links to the monomer JSON schema.
  • Added the location of the XML schema for xHELM.
  • Clarified the mandatory use of ? for connection points on unknown monomers such as *, X and N.
  • Clarified the rules for the use of in-line HELM, particularly around connection point definition.
  • Minor revisions to aid clarity.

Test set

The team have compiled a set of around 150 structures that illustrate the full range of structures that can be encoded in HELM. This is included as a resource for anyone who wants to implement their own HELM tools rather than use the open source toolkit.

Peptide Monomer Guidelines

We have developed recommendations for creating and naming a monomer set for peptides. 

Nucleotide Monomer Guidelines

More Information

An academic overview of HELM and its origins

HELM 1.0 overview

xHELM overview



HELM 2.0 quick reference diagram

HELM 2.0 Design Document

Please note - this was the document used to discuss HELM 2.0 and is not the definitive specification of the changes. It does illustrate the thinking and provide some examples that may be useful to groups working with HELM 2.0. There are a few errors in the HELM strings and some require monomers not in the standard set, so use with caution. 

HELM 2 Syntax editing

If you create HELM manually it can be helpful to use an editor that will highlight different parts of the string. We will collect suggestions from the community here:

Notepad++ highlighting definition file

HELM Syntax Highlighting2.xml

To use this language definition file, go to [Language] -> [Define your language...], click [Import...] and load the XML file. After closing the dialog, restart Notepad++. You can then assign the language to your currently open file via [Language] -> [HELM2].