The number of sequences in CHAINS_all.seq
Dear all,
I found something unusual... Why are there so many duplicated sequences in modlib/CHAINS_all.seq?
$ cat CHAINS_all.seq | grep "^>" | wc -l
118200
$ cat CHAINS_all.seq | grep "^>" | sort | uniq | wc -l
59796
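The two counts above show that roughly half of the header lines are repeats. To see which headers repeat and how often, something like the following might help. It is only a rough sketch: `dup_headers` is a hypothetical helper name (not part of Modeller), and it assumes, as the commands above do, that every entry's header line starts with `>`. It compares header lines only, so it says nothing about whether the sequences under them are identical.

```shell
# Hypothetical helper (not shipped with Modeller): print each header line
# that occurs more than once, prefixed by its count, most frequent first.
# Assumes header lines start with '>' as in the commands above.
dup_headers() {
    grep '^>' "$1" |   # keep only the header lines
        sort |         # group identical headers together
        uniq -c |      # prefix each distinct header with its count
        sort -rn |     # highest counts first
        awk '$1 > 1'   # keep only headers seen more than once
}

# Usage (file name taken from the message above):
#   dup_headers CHAINS_all.seq | head
```

Counting with `uniq -c` rather than plain `uniq` keeps the per-header multiplicity, which should show whether the duplication is uniform (for example every entry appearing twice) or concentrated in a few entries.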
Thank you very much!
I mean the file shipped with modeller8v2... :)
Zhiqiang Ye wrote:
> I found an unusual thing...
> Why is there so many duplicated sequences in modlib/CHAINS_all.seq?
>
> $ cat CHAINS_all.seq | grep "^>" | wc -l
> 118200
>
> $ cat CHAINS_all.seq | grep "^>" | sort | uniq | wc -l
> 59796
That file is deprecated and contains old sequences. If you need these databases, use the updated chains databases available from http://salilab.org/modeller/supplemental.html instead. Future releases of Modeller will not include the CHAINS_* files.
Ben Webb, Modeller Caretaker