Changing the direction of backbone connections

HELM is a directional notation. For example, in RNA sequences the start of the chain is the 5’ end of the polymer and the end is the 3’ end.

 

RNA1{R(C)P.R(A)}$$$$V2.0

 

This direction is encoded by the definition of the backbone connection: the R2 of one monomer is connected to the R1 of the next.

 

 

However, there are times when the molecule you want to encode has a different backbone configuration, for example when a connection is reversed.

 

There are 2 options:

  1. Create a manually defined connection between different simple polymers.

  2. Define a new monomer with the R group numbers changed to reflect your desired connection points.

 

We recommend option 2.

 

If you use option 1 with the original monomers, the HELM string would look like this:

RNA1{P}|RNA2{R(C)}|RNA3{R(A)}$RNA1,RNA3,1:R2-1:R1|RNA1,RNA2,1:R1-1:R1$$$V2.0

This form takes longer to construct with the available editing tools and is more prone to error, so we don’t recommend this approach.
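To make the structure of the manual connections easier to check, the connection section of such a string can be pulled apart programmatically. The following is a minimal Python sketch, not part of any HELM toolkit: the names Connection and parse_connections are hypothetical, and it assumes a HELM V2.0 string with exactly five "$"-separated sections and no "$" characters inside annotations.

```python
from typing import List, NamedTuple


class Connection(NamedTuple):
    """One entry of the connection section (hypothetical helper type)."""
    source_polymer: str
    target_polymer: str
    source_monomer: str
    source_rgroup: str
    target_monomer: str
    target_rgroup: str


def parse_connections(helm: str) -> List[Connection]:
    """Return the explicit connections listed in a HELM V2.0 string.

    Assumption: sections are split naively on "$", which is only safe
    when no section contains a literal "$".
    """
    sections = helm.split("$")
    if len(sections) != 5:
        raise ValueError("expected five '$'-separated sections")
    connections = []
    # Entries look like "RNA1,RNA3,1:R2-1:R1" and are separated by "|".
    for entry in filter(None, sections[1].split("|")):
        source_polymer, target_polymer, link = entry.split(",")
        source_end, target_end = link.split("-")
        source_monomer, source_rgroup = source_end.split(":")
        target_monomer, target_rgroup = target_end.split(":")
        connections.append(Connection(
            source_polymer, target_polymer,
            source_monomer, source_rgroup,
            target_monomer, target_rgroup,
        ))
    return connections


example = ("RNA1{P}|RNA2{R(C)}|RNA3{R(A)}"
           "$RNA1,RNA3,1:R2-1:R1|RNA1,RNA2,1:R1-1:R1$$$V2.0")
for connection in parse_connections(example):
    print(connection)
```

Each printed entry names both polymers, both monomer positions and both R groups, which shows how many details have to be entered correctly by hand when option 1 is used.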

 

Creating a reversed connection using a new monomer

To reverse the connection in the example above, the user should define the monomer with the R1 and R2 groups swapped. A reversed ribose monomer would look like this:

 

The guidelines treat monomers that have the same capped structure but different connection points as different monomers, so both forms can be stored in a monomer library.

We also recommend naming the monomer with the connectivity as a prefix to the monomer symbol. In this case, R becomes [35R] to show that it now has 3,5 connectivity.

Other common sugars would be numbered in a similar way:

               R    -> [35R]

               [dR] -> [35dR]

               [mR] -> [35mR]

               [fR] -> [35fR]
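The naming rule in the list above is mechanical, so it can be sketched in a few lines of Python. The function name reversed_symbol is hypothetical and the sketch only covers the naming convention; it does not create or register a monomer definition in any library.

```python
def reversed_symbol(symbol: str) -> str:
    """Return the suggested symbol for the 3,5-connected form of a sugar.

    Single-letter symbols such as R are written without brackets in HELM,
    multi-character symbols such as [dR] with brackets. The reversed
    symbol is always multi-character, so it is always bracketed.
    Note: no check is made for symbols that already carry the prefix.
    """
    if symbol.startswith("[") and symbol.endswith("]"):
        core = symbol[1:-1]  # strip the surrounding brackets
    else:
        core = symbol
    return f"[35{core}]"


for sugar in ["R", "[dR]", "[mR]", "[fR]"]:
    print(sugar, "->", reversed_symbol(sugar))
```

Running the loop reproduces the four mappings listed above.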

 

So our original example becomes RNA1{[35R](C)P.[35R](A)}$$$$V2.0. The connection is reversed and the 5’ end is now on the right.
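The rewrite of the example can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than a HELM toolkit function: reverse_sugars and REVERSED_SUGARS are hypothetical names, only the four sugars listed above are recognised, the input is the text between the braces of one RNA simple polymer, and bracketed symbols that themselves contain "]" (for example inline structures) are not handled.

```python
import re

# Assumption: the only sugars to swap are the four listed in this section.
REVERSED_SUGARS = {
    "R": "[35R]",
    "[dR]": "[35dR]",
    "[mR]": "[35mR]",
    "[fR]": "[35fR]",
}

# A token is either a bracketed symbol or any single character.
TOKEN = re.compile(r"\[[^\]]*\]|.", re.DOTALL)


def reverse_sugars(polymer_body: str) -> str:
    """Swap every backbone sugar for its 3,5-connected counterpart.

    Symbols inside parentheses are branch monomers (bases) and are
    left untouched, even if they look like a sugar symbol.
    """
    out = []
    depth = 0  # greater than 0 while inside a "(...)" branch
    for token in TOKEN.findall(polymer_body):
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
        elif depth == 0:
            token = REVERSED_SUGARS.get(token, token)
        out.append(token)
    return "".join(out)


print(reverse_sugars("R(C)P.R(A)"))  # [35R](C)P.[35R](A)
```

Tracking the parenthesis depth is what keeps a base symbol from being mistaken for a backbone sugar.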

 

Canonicalization recommendations

If you reverse the whole of a simple polymer then you have a chain that could also be represented using standard monomers, just connected in the opposite order. For example:

RNA1{[35R](C)P.[35R](A)}$$$$V2.0 is the same as RNA1{R(A)P.R(C)}$$$$V2.0

We recommend that reversed monomers are only used where there is no alternative. In practice this will be when a sequence contains a combination of both forward and reverse sugars.
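One way to apply this recommendation is to rewrite a fully reversed simple polymer with standard monomers and leave mixed polymers alone. The Python sketch below is only an illustration under stated assumptions: canonicalize, split_backbone and the two symbol tables are hypothetical names, the input is the text between the braces of one RNA simple polymer, the backbone may contain only the four sugars listed earlier (forward or reversed) plus P, and P is treated as symmetric. Anything else is returned unchanged.

```python
import re

# Assumption: only these reversed sugars and their standard forms are known.
REVERSED_TO_FORWARD = {
    "[35R]": "R", "[35dR]": "[dR]", "[35mR]": "[mR]", "[35fR]": "[fR]",
}
FORWARD = set(REVERSED_TO_FORWARD.values())
KNOWN = FORWARD | set(REVERSED_TO_FORWARD) | {"P"}

# A token is either a bracketed symbol or any single character.
TOKEN = re.compile(r"\[[^\]]*\]|.", re.DOTALL)


def split_backbone(body):
    """Return [symbol, branch] pairs in backbone order (branch may be None)."""
    monomers = []
    in_branch = False
    for token in TOKEN.findall(body):
        if token == "(":
            in_branch = True
            monomers[-1][1] = ""  # branch belongs to the previous monomer
        elif token == ")":
            in_branch = False
        elif in_branch:
            monomers[-1][1] += token
        elif token != ".":  # "." only groups monomers into units
            monomers.append([token, None])
    return monomers


def canonicalize(body):
    """Rewrite a fully reversed polymer with standard monomers.

    Mixed polymers, polymers without reversed sugars, and polymers with
    unknown backbone monomers are returned unchanged.
    """
    monomers = split_backbone(body)
    symbols = [symbol for symbol, _ in monomers]
    if any(symbol not in KNOWN for symbol in symbols):
        return body
    if not any(symbol in REVERSED_TO_FORWARD for symbol in symbols):
        return body
    if any(symbol in FORWARD for symbol in symbols):
        return body  # mixed: reversed monomers are genuinely needed
    parts = []
    for symbol, branch in reversed(monomers):
        forward = REVERSED_TO_FORWARD.get(symbol, symbol)
        if forward in FORWARD and parts:
            parts.append(".")  # each sugar starts a new unit
        parts.append(forward)
        if branch is not None:
            parts.append(f"({branch})")
    return "".join(parts)


print(canonicalize("[35R](C)P.[35R](A)"))  # R(A)P.R(C)
print(canonicalize("[35R](C)P.R(A)"))      # unchanged: mixed sugars
```

Returning the input unchanged in every uncertain case is a deliberate choice here, since a wrong rewrite would silently describe a different molecule.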