run clustering.py on multiple stat files

מירב בריטברד

27 Nov 2017

12:59 a.m.

Hi, I ran modeling.py multiple times, which created a separate output folder for each run, each containing a stat file. Now I want to run clustering.py on all the stat files in the different folders. How can I do it? Thanks, Merav

Ben Webb

28 Nov 2017

1:25 p.m.

On 11/27/17 12:59 AM, מירב בריטברד wrote:
> I ran modeling.py multiple times, which created a separate output folder
> for each run, each containing a stat file. Now I want to run clustering.py
> on all the stat files in the different folders. How can I do it?

Assuming you mean you want to merge multiple runs into one large ensemble and cluster the whole thing using the AnalysisReplicaExchange0 macro, just set the merge_directories argument to a list of all the directories containing your stat files (and RMF trajectories).

Ben

-- ben@salilab.org https://salilab.org/~ben/ "It is a capital mistake to theorize before one has data." - Sir Arthur Conan Doyle
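
A minimal sketch of how the list for merge_directories might be built, assuming the runs live under a common root and that stat files match the pattern `stat*.out` (both are assumptions, as is the helper name `find_stat_directories`; neither comes from the thread). The call to the macro is left in comments because it needs an IMP installation and further arguments. Depending on how the macro's output-directory settings are configured, it may expect each run's top-level directory rather than the subfolder that directly holds the stat files, so check that against your own setup.

```python
from pathlib import Path


def find_stat_directories(root, stat_glob="stat*.out"):
    """Return a sorted list of directories under `root` that directly
    contain at least one file matching `stat_glob`.

    `stat_glob` is an assumed naming pattern; adjust it to your stat files.
    """
    root = Path(root)
    # Collect the parent directory of every matching stat file (deduplicated).
    dirs = {p.parent for p in root.rglob(stat_glob) if p.is_file()}
    return sorted(str(d) for d in dirs)


# Hypothetical usage (requires IMP; other macro arguments omitted):
#
#   import IMP
#   import IMP.pmi.macros
#
#   model = IMP.Model()
#   merge_directories = find_stat_directories("runs")
#   analysis = IMP.pmi.macros.AnalysisReplicaExchange0(
#       model, merge_directories=merge_directories)
```

Building the list programmatically avoids typing out every folder by hand when there are many runs.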

מירב בריטברד

29 Nov 2017

4:13 a.m.

Thanks! I have another question. I have 1000 different runs, and some of them are better than others. Is there an efficient way to choose only the runs with the best scores, and then use the AnalysisReplicaExchange0 macro on a reduced list of the best-scoring runs?

Ben Webb

30 Nov 2017

11:23 a.m.

On 11/29/17 4:13 AM, מירב בריטברד wrote:
> Thanks! I have another question. I have 1000 different runs, and some of
> them are better than others. Is there an efficient way to choose only the
> runs with the best scores, and then use the AnalysisReplicaExchange0 macro
> on a reduced list of the best-scoring runs?

I don't think that will buy you anything - the clustering should do a better job given the entire ensemble. If you want to discard entire runs, you'd need to make that decision yourself. You can certainly set the prefiltervalue argument, though, to quickly discard the worst-scoring models.

Ben
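
To illustrate the idea of a score prefilter, here is a small stand-alone sketch. It is not PMI's implementation: the function name `prefilter_and_rank`, the (model_id, score) input format, and the assumption that lower scores are better are all hypothetical choices made for this example.

```python
def prefilter_and_rank(scores, prefilter_value=None, n_best=10):
    """Drop models scoring at or above `prefilter_value`, then return the
    `n_best` remaining models ordered from best (lowest) to worst score.

    `scores` is a list of (model_id, score) pairs; lower is assumed better.
    """
    kept = [
        (model_id, score)
        for model_id, score in scores
        if prefilter_value is None or score < prefilter_value
    ]
    # Sort ascending by score so the best-scoring models come first.
    kept.sort(key=lambda pair: pair[1])
    return kept[:n_best]


# Example with made-up scores pooled from several runs.
pooled = [("run1/0", 5.0), ("run2/3", 1.0), ("run3/7", 10.0), ("run1/4", 3.0)]
best = prefilter_and_rank(pooled, prefilter_value=6.0, n_best=2)
```

Because the threshold applies to individual models pooled from all runs, a run that is poor overall can still contribute its few good models, which is why filtering models rather than whole runs loses less information.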
