Looking at the PDB file, it has 316 residues, then a molecule of FAD
tacked on at a position designated 400.
...followed by a bunch of waters from residue 500 to 945, followed by an
SO4 at 440. (There are actually more than 316 residues, too.)
Modeller always reads PDB residues in order, so if the ligand is at the
end of the PDB file, you put the blk residue at the end of the sequence
in the alignment too. If you set env.io.hetatm to True, all of the
HETATMs in the model (at least in the range specified by model_segment)
will be read, so they must all be listed in your alignment.
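To make the "in file order, one character per residue" rule concrete, here is a rough sketch in plain Python. It is not Modeller's own parser, and the function names and the keep_hetatm/keep_water flags are made up for illustration (Modeller has a separate env.io.water switch for waters, which the second flag loosely imitates). Everything that is not a standard amino acid is written as blk (.), which is a simplification.

```python
# Sketch only: imitates "one alignment character per residue, in file order".
# Names and flags are hypothetical; this is not Modeller code.

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}
WATER_NAMES = {"HOH", "WAT", "H2O"}


def residues_in_file_order(pdb_lines, keep_hetatm=True, keep_water=False):
    """Return (record, resname, chain, resid) tuples in order of first appearance."""
    seen = set()
    residues = []
    for line in pdb_lines:
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        resname = line[17:20].strip()
        chain = line[21]
        resid = line[22:27].strip()  # residue number plus insertion code
        if record == "HETATM" and not keep_hetatm:
            continue
        if resname in WATER_NAMES and not keep_water:
            continue
        key = (chain, resid, resname)
        if key not in seen:  # many atoms, but only one character per residue
            seen.add(key)
            residues.append((record, resname, chain, resid))
    return residues


def alignment_string(residues):
    """One character per residue: one-letter code if known, otherwise blk ('.')."""
    return "".join(THREE_TO_ONE.get(resname, ".") for _, resname, _, _ in residues)


def _atom(record, serial, name, resname, chain, resnum):
    """Build a minimal fixed-column PDB atom line for the demo below."""
    return f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resnum:>4}"


# Toy file: two amino acids, then FAD at 400, a water at 500, SO4 at 440.
demo = [
    _atom("ATOM", 1, "N", "MET", "A", 1),
    _atom("ATOM", 2, "CA", "MET", "A", 1),
    _atom("ATOM", 3, "CA", "GLY", "A", 2),
    _atom("HETATM", 4, "PA", "FAD", "A", 400),
    _atom("HETATM", 5, "O", "HOH", "A", 500),
    _atom("HETATM", 6, "S", "SO4", "A", 440),
]

print(alignment_string(residues_in_file_order(demo)))                     # MG..
print(alignment_string(residues_in_file_order(demo, keep_hetatm=False)))  # MG
```

Note that the residue numbers (400, 500, 440) play no part in the ordering; only the position in the file does.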
So I align this to some target like below (cropped):
--------------------------MNGLETHNTRLCIVGSGPAAHTAAIYAARAELKP
LLFEGWMANDIAPGGQLTTTTDVENFPGFPEGILGVELTDKFRKQSERFGTTIFTETVTK
VDFSSKPFKLFTDS---KAILADAVILAIGAVAKRLSFVGSGEVLGGFWNRGISACAVCD
GAAPIFRNKPLAVIGGGDSAMEEANFLTKYGSKVYIIHRRDAFRASKIMQQRALSNPKID
VIWNSSVVEAYGDGERDVLGGLKVKNVVTGDVSDLKVSGLFFAIGHEPATKFLDGGVELD
SDGYVVTKPGTTQTSVPGVFAAGDVQDKKYRQAITAAGTGCMAALDAEHYLQEIGSQEGK
SD-
*
P1;fake1
DASGLSVAAAATLSQKSTPYYQSEIHTIGKRRMHSKVVIIGSGPAAHTAAIYLARAELKP
VLYEGFMANGVAAGGQLTTTTEVENFPGFPEAVTGQELMDKMRAQSERFGTVIVSETVGK
LDLSKRPFEYSTEWSPDTVMTADAVILATGASARRLGLPGED----KYWQNGISACAVCD
GAVPIFRNKPLVVIGGGDSAAEEAIFLTKYGSHVTVLVRRDKLRASSIMARRLLAN----
------------------------------------------------------------
-------------KKVTGLFAAGDVQDKRYRQAITSAGTGCMAALDAEKYLEELEDEQAD
GKL
*
Where should I stick the FAD? At the end?
That's not a complete alignment file, so I can't tell which is supposed
to be the 1vdc sequence. But the true 1vdc sequence can be obtained by
Modeller using the script at http://salilab.org/modeller/FAQ.html#18 (merely modified by setting env.io.hetatm = True):
Since the FAD comes after the regular amino acids in the PDB file, the
blk residue (.) comes immediately after the regular sequence. You can
also see a # residue there - that's the SO4. You can see which HETATMs
Modeller has codes for by looking at modlib/restyp.lib, but if in doubt
use blk (.) since that'll match everything.
You will often see a chain break (/) immediately preceding blk residues
in alignments. That's only necessary if you want to force the ligands to
have a different chain ID to the amino acids. (If you want them in the
same chain, leave out the chain break.)
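As an illustration of where the blk residues and the optional chain break go, here is a small helper that builds the sequence part of an alignment entry. The function name and its arguments are hypothetical, and it only produces the sequence lines (ending in the usual *), not the header lines of the entry.

```python
def sequence_with_ligands(protein_seq, n_ligand_residues, new_chain=False, width=60):
    """Append one blk ('.') per ligand residue, optionally after a chain break ('/').

    Returns the sequence body terminated by '*' and wrapped to `width` columns.
    """
    body = protein_seq
    if new_chain and n_ligand_residues > 0:
        body += "/"  # forces the ligands into a different chain from the protein
    body += "." * n_ligand_residues + "*"
    # Plain slicing, so gap characters ('-') never influence the wrapping.
    return "\n".join(body[i:i + width] for i in range(0, len(body), width))


print(sequence_with_ligands("MNGLET", 1))                  # MNGLET.*
print(sequence_with_ligands("MNGLET", 1, new_chain=True))  # MNGLET/.*
```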
How many blk characters should I tack on? 1 because there is only one
molecule?
HETATM residues are treated in exactly the same way as ATOM residues, so
one character per PDB residue.
What you get in your model, of course, depends on what is in your target
sequence. For example, you may want to build a model containing FAD but
not the SO4. In this case, you would align a blk residue in the model to
the corresponding FAD blk residue in the template, but align a gap to
the SO4.
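A minimal sketch of that FAD-but-not-SO4 case, using short made-up rows in place of the real alignment. The helper name ligand_columns is hypothetical, and blk (.) is used for both template ligands here for simplicity.

```python
def ligand_columns(template_het_names, keep):
    """Target-row characters for the template's hetero residues, in template order.

    '.' (blk) carries the residue over into the model; '-' (gap) leaves it out.
    """
    return "".join("." if name in keep else "-" for name in template_het_names)


# Hypothetical short rows standing in for the real, much longer alignment.
template_protein = "MNGLETHNTR"
target_protein = "MHSKVVIIGS"
template_hets = ["FAD", "SO4"]  # in the order they appear in the template PDB

template_row = template_protein + "." * len(template_hets) + "*"
target_row = target_protein + ligand_columns(template_hets, keep={"FAD"}) + "*"

# Both rows must have the same number of alignment columns.
assert len(template_row) == len(target_row)

print(template_row)  # MNGLETHNTR..*
print(target_row)    # MHSKVVIIGS.-*
```

The target row ends in ".-": a blk aligned to the template's FAD, and a gap aligned to the SO4.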