Question about compressing Uniprot90
I am trying to compress the Uniprot90 FASTA file database into a binary file using the convert() function, as shown in the MODELLER examples. However, my compressed file ends up larger than the uncompressed original. What might be the problem? Here is the Python code I am using with MODELLER v9.7.
from modeller import *

log.verbose()
env = environ()

sdb = sequence_db(env)
sdb.convert(seq_database_file='/gne/research/data/bioinfo/modeller/db/uniprot90',
            seq_database_format='FASTA',
            chains_list='ALL',
            minmax_db_seq_len=[30, 4000],
            clean_sequences=True,
            outfile='/gne/research/data/bioinfo/modeller/db/uniprot90_pruned.bin')

Kind regards and thank you, Dimitrije Jevremovic

On 07/11/2012 02:46 PM, Dimitrije Jevremovic wrote:
> I am trying to compress the Uniprot90 FASTA file database into a
> binary file using convert() function as provided in examples for the
> MODELLER. However, my compressed file ends up being even larger than
> the uncompressed original one.

Binary files are faster to access than text files. They are not necessarily smaller (for one thing, the indexes used for random-access lookup of sequences take up space).
Ben Webb, Modeller Caretaker
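
To illustrate the point about indexes, here is a minimal sketch in plain Python. The binary layout below is entirely hypothetical and is NOT MODELLER's actual file format; the record names and sequences are made up. It only shows how length fields and a random-access offset table can make a binary file larger than the equivalent FASTA text, while allowing a record to be fetched without scanning the file.

```python
import struct

# Toy records; names and sequences are invented for illustration.
records = [("seq1", "MKT"), ("seq2", "GAVLI"), ("seq3", "MSTNPKPQRK")]


def fasta_bytes(records):
    """Plain FASTA text: a '>' header line followed by the sequence line."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records).encode("ascii")


def indexed_binary_bytes(records):
    """Hypothetical indexed layout (an assumption, not MODELLER's format):
    header  : record count (uint32)
    index   : one uint64 payload offset per record
    payload : per record, uint32 name length, uint32 sequence length, bytes
    """
    payload = bytearray()
    offsets = []
    for name, seq in records:
        offsets.append(len(payload))
        n, s = name.encode("ascii"), seq.encode("ascii")
        payload += struct.pack("<II", len(n), len(s)) + n + s
    header = struct.pack("<I", len(records))
    index = struct.pack(f"<{len(offsets)}Q", *offsets)
    return header + index + bytes(payload)


def fetch(blob, i):
    """Random access: jump straight to record i using the offset table."""
    (count,) = struct.unpack_from("<I", blob, 0)
    (offset,) = struct.unpack_from("<Q", blob, 4 + 8 * i)
    base = 4 + 8 * count + offset
    nlen, slen = struct.unpack_from("<II", blob, base)
    start = base + 8
    name = blob[start:start + nlen].decode("ascii")
    seq = blob[start + nlen:start + nlen + slen].decode("ascii")
    return name, seq


text = fasta_bytes(records)
blob = indexed_binary_bytes(records)
print(len(text), len(blob))   # the binary form is the larger of the two here
print(fetch(blob, 2))         # retrieved without reading records 0 and 1
```

In this toy layout the per-record overhead (two length fields plus an index entry) outweighs the few delimiter characters that FASTA needs, so the size difference is most visible when sequences are short. The trade-off is lookup speed: fetch() reads a fixed number of bytes regardless of where the record sits in the file.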