Blobs and CHEM{*}
HELM has two ways to describe material whose composition is unknown.
You can use the CHEM polymer type:
CHEM1{*}$$$$V2.0
Or you can use a blob:
BLOB1{Bead}$$$$V2.0
The HELM specification does not constrain your choice, so it is worth comparing the two.
Concepts
CHEM monomers are intended to be organic small molecules whose structure does not conform to any of the defined polymer types. The CHEM type is there to cover antibody-drug conjugates and similar situations.
When the CHEM monomer is entirely unknown, you can use * to represent it; however, the expectation is that it is still an organic small(ish) molecule.
Blobs are used for particles where the material involved is less organic, for example a metal such as gold or a non-biological polymer such as polystyrene. A blob is often a material with a biopolymer adhered to it, but it can be anything large, including a shorthand (with no structure) for adding a whole antibody to a structure.
Key to the blob type is the name and annotation. The HELM tools have two default types: ‘Bead’ and ‘Gold Particle’, but these can be extended on request. The vocabulary is constrained by the tools and the description is very clear in the HELM string.
Annotation
Both CHEM{*} and BLOB can be annotated either in the first or third section of the HELM string. First section annotations are immediately after the monomer. The Blob has a type as well as the annotation.
BLOB1{Bead}"Annotation"$$$$V2.0
CHEM1{*"Annotation"}$$$$V2.0
A BLOB is not a monomer, so it is not possible to annotate within the {}s. A CHEM{*} can be annotated either as a monomer within the brackets or afterwards like the BLOB notation above. In practice these amount to the same thing, but the display will be slightly different.
CHEM1{*"Annotation"}$$$$V2.0
CHEM1{*}"Annotation"$$$$V2.0
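To make the two annotation positions concrete, here is a minimal Python sketch that reads the annotation from a single simple polymer. It is not part of the HELM tools: the pattern `SIMPLE_POLYMER` and the function `read_annotation` are hypothetical names, and the regular expression assumes that annotations contain no double quotes or braces.

```python
import re

# Hypothetical pattern for one simple polymer, e.g. CHEM1{*"note"} or BLOB1{Bead}"note".
# Assumption: annotations contain no double quotes or braces.
SIMPLE_POLYMER = re.compile(
    r'^(?P<type>[A-Z]+)(?P<index>\d+)'   # polymer type and number, e.g. CHEM1
    r'\{(?P<body>[^{}"]*)'               # content before any inline annotation
    r'(?:"(?P<inner>[^"]*)")?\}'         # optional annotation inside the braces
    r'(?:"(?P<outer>[^"]*)")?$'          # optional annotation after the braces
)


def read_annotation(simple_polymer):
    """Return the polymer id, its content, and where the annotation sits."""
    match = SIMPLE_POLYMER.match(simple_polymer)
    if match is None:
        raise ValueError(f"not a simple polymer this sketch understands: {simple_polymer}")
    inner, outer = match.group("inner"), match.group("outer")
    # A BLOB is not a monomer, so an annotation inside the braces is rejected.
    if match.group("type") == "BLOB" and inner is not None:
        raise ValueError("a BLOB cannot be annotated inside the braces")
    if inner is not None:
        annotation, position = inner, "monomer"
    elif outer is not None:
        annotation, position = outer, "polymer"
    else:
        annotation, position = None, None
    return {
        "polymer": match.group("type") + match.group("index"),
        "content": match.group("body"),
        "annotation": annotation,
        "position": position,
    }
```

For example, `read_annotation('CHEM1{*"Annotation"}')` reports the position as `"monomer"`, while `read_annotation('CHEM1{*}"Annotation"')` reports it as `"polymer"`; the annotation text is the same in both cases.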
First section annotations should be simple and fairly short and apply to a specific part of the molecule. If you want to add complex, structured information to the whole polymer, use the third section annotation. You can include information in JSON format in the third section, so it provides considerable scope for expansion.
PEPTIDE1{C.C.C.C.C.C}|BLOB1{Gold Particle}"Au10, Diameter:10nm"$PEPTIDE1,BLOB1,C:R3-?:?$G1(PEPTIDE1:20-34+BLOB1)${"Name":"Gold particle conjugated with peptides","Load":26}$
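Because the whole-polymer annotation is JSON, it can be read with a standard JSON parser once it has been separated from the rest of the string. The following Python sketch shows one way to do that. It is an illustration rather than the behaviour of any HELM toolkit: `split_sections` and `read_json_annotation` are hypothetical helpers, the example string is the one above written on a single line with straight quotes, and the sketch assumes there are no escaped quotes inside annotations.

```python
import json

# The gold particle example, on one line and with straight quotes.
EXAMPLE = (
    'PEPTIDE1{C.C.C.C.C.C}|BLOB1{Gold Particle}"Au10, Diameter:10nm"'
    '$PEPTIDE1,BLOB1,C:R3-?:?'
    '$G1(PEPTIDE1:20-34+BLOB1)'
    '${"Name":"Gold particle conjugated with peptides","Load":26}$'
)


def split_sections(helm):
    """Split on '$', ignoring any '$' that sits inside double quotes."""
    sections, current, in_quotes = [], [], False
    for char in helm:
        if char == '"':
            in_quotes = not in_quotes  # assumption: no escaped quotes
        if char == "$" and not in_quotes:
            sections.append("".join(current))
            current = []
        else:
            current.append(char)
    sections.append("".join(current))
    return sections


def read_json_annotation(helm):
    """Return the JSON annotation as a dict, or None if there is none."""
    # Heuristic: the JSON annotation is the only section that starts with '{'.
    for section in split_sections(helm):
        if section.startswith("{"):
            return json.loads(section)
    return None


annotation = read_json_annotation(EXAMPLE)
print(annotation["Name"], annotation["Load"])
```

Keeping the structured data in JSON means that new keys can be added later without changing how the string is split.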
Connection
Since the structure is unknown, the connection must also be unknown. Therefore, neither a BLOB nor a CHEM{*} can have numbered R groups. The monomer itself can be numbered (a blob will always have the number 1), but the R group cannot be defined, since there is no known monomer structure to connect it to.
So, this is not allowed:
BLOB1{Bead}|PEPTIDE1{A.G.T}$BLOB1,PEPTIDE1,1:R1-?:?$$$V2.0
But this is fine:
BLOB1{Bead}|PEPTIDE1{A.G.T}$BLOB1,PEPTIDE1,1:?-?:?$$$V2.0
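The rule can be checked mechanically. The Python sketch below flags connections that give a numbered R group to a BLOB or a CHEM{*}. It is a rough illustration, not the validation performed by the HELM tools: `unknown_polymers` and `check_connections` are hypothetical names, and the parsing assumes a well-formed string with no `$` or `|` characters inside annotations.

```python
import re

# Hypothetical pattern for one connection, e.g. BLOB1,PEPTIDE1,1:R1-?:?
CONNECTION = re.compile(
    r'^(?P<source>[A-Z]+\d+),(?P<target>[A-Z]+\d+),'
    r'(?P<source_monomer>[^:]+):(?P<source_r>[^-]+)-'
    r'(?P<target_monomer>[^:]+):(?P<target_r>.+)$'
)


def unknown_polymers(first_section):
    """Return the ids of every BLOB and every CHEM whose content is just '*'."""
    unknown = set()
    for simple in first_section.split("|"):
        polymer_id, _, rest = simple.partition("{")
        # Take the content up to the closing brace, dropping any inline annotation.
        body = rest.split("}", 1)[0].split('"', 1)[0]
        if polymer_id.startswith("BLOB") or (polymer_id.startswith("CHEM") and body == "*"):
            unknown.add(polymer_id)
    return unknown


def check_connections(helm):
    """List connections that use a numbered R group on an unknown structure."""
    sections = helm.split("$")  # assumption: no '$' inside annotations
    unknown = unknown_polymers(sections[0])
    problems = []
    for connection in filter(None, sections[1].split("|")):
        match = CONNECTION.match(connection)
        if match is None:
            problems.append(f"could not parse connection: {connection}")
            continue
        for side in ("source", "target"):
            polymer_id = match.group(side)
            r_group = match.group(f"{side}_r")
            if polymer_id in unknown and re.fullmatch(r"R\d+", r_group):
                problems.append(f"{polymer_id} has unknown structure but uses {r_group}")
    return problems
```

Applied to the two strings above, `check_connections` returns one problem for the first (`BLOB1` uses `R1`) and an empty list for the second.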
Brackets
You can't have CHEMs or BLOBs inside the brackets that mark a repeating group.
If the CHEM or BLOB is the only thing inside the brackets, there is no way to know how it is connected to itself.
If it is connected to anything else, then you have more than one simple polymer, and you cannot define repeating groups across multiple simple polymers.
Summary
Blobs and CHEM{*} behave in a similar way, but they are used in different contexts. The project is interested in extending the blob name vocabulary, so please let us know if you have any suggestions.