Hi everyone, here is the response I got from Jeff Hoch. Check it out if you want to be in the loop.

In short, we'll cover introductions and probably go over some slides from Jeff on Bayesian validation with respect to the PDB. If you won't be attending, please let me know so I can prepare a quick intro on your behalf, or send me one yourself.

Thanks!

- Jared

---------- Forwarded message ---------
From: Hoch,Jeffrey <hoch@uchc.edu>
Date: Thu, Jun 1, 2023 at 5:54 AM
Subject: Re: First meeting on Bayesian model validation (05/26)
To: Jared Sagendorf <jared.sagendorf@rcsb.org>
Cc: Baskaran,Kumaran <baskaran@uchc.edu>, Gryk,Michael R. <gryk@uchc.edu>, Eghbalnia,Hamid R. <heghbalnia@uchc.edu>, Pustovalova,Yulia <ypustovalova@uchc.edu>, Pozhidaeva,Alexandra <pozhidaeva@uchc.edu>, Courtney,Joseph M. <jcourtney@uchc.edu>

Hi Jared –

I love your “few things”! Those are a mouthful 😉. I’ll introduce our team members, and it would be great if you could summarize the discussions that you had in your initial meeting. We can offer an NMR perspective on the topics. As this will be our first all-hands meeting, I might share a small slide deck that I used to make the case to the wwPDB PIs that there is a need and an opportunity to make structure validation “more Bayesian”. I think I shared some of those slides with you all when I visited, but the deck is short and would be a good way of getting us all on the same wavelength (although it’s pretty clear we’re very close, if not already there).

I’m attaching a manuscript that’s currently undergoing final revisions. Please share it among your group, but not outside. A bit of context – when I took over as head of BMRB, there was already a validation task force working on revamping the validation pipeline for NMR structures. Unfortunately, although the effort is/was very well-intentioned, it retains much ad hoc and archaic language, e.g. “violations” of NMR “restraints” are dealt with in a way that simply isn’t consistent or applicable in a broader sense to any other type of empirical data. Rather than move the goalposts on the task force, I suggested they complete their work to achieve the original goal, and we would start a Bayesian initiative afresh. It serves to highlight some of the issues we will have to deal with – such as how do you convert hard upper and lower distance bounds into something that can yield a realistic distribution of errors/structures? This will be necessary for retrospective analysis of NMR structures in the PDB because distance bounds are in most cases all that was supplied by depositors. Going forward, BMRB will need to require peak tables with intensities of NOESY cross-peaks, or perhaps even raw time-domain data for NOESY experiments.

I’m cc’ing the rest of our team so you can capture their email addresses if you haven’t already. They are

Kumaran Baskaran – BMRB liaison to wwPDB and BMRB representative of the NMR VTF

Michael Gryk – associate director of BMRB and our bona fide data scientist

Hamid Eghbalnia – lead of the analytics technology development component of the NMRbox P41 grant, and our bona fide statistician/Bayesian

Yulia Pustovalova and Sasha Pozhidaeva, NMR spectroscopists par excellence, who have been utilizing AlphaFold in their workflows and trying out ways to validate computed structures based on prior knowledge.

Joseph Courtney – NMR spectroscopist and developer of the COMPASS package from Chad Rienstra’s group – COMPASS used MODELLER, chemical shift prediction, and integrated some other software packages to determine protein structures from unassigned carbon-carbon correlation spectra (solid-state NMR). The use of forward-modeling of chemical shifts and unassigned peak lists to drive/constrain the structure determination was ahead of its time and very pertinent to Bayesian validation.

Looking forward to seeing you tomorrow.

Yours, Jeff

From: Jared Sagendorf <jared.sagendorf@rcsb.org>
Date: Wednesday, May 31, 2023 at 2:01 PM
To: "Hoch,Jeffrey" <hoch@uchc.edu>
Subject: Re: First meeting on Bayesian model validation (05/26)

Hi Jeff, just looping back to this! A few things that came up during the initial meeting with Andrej:

- Development of improved metrics for model quality

- Standardization of priors, forward functions, and likelihoods

- Establishment of a standardized vocabulary

- How to get different communities involved in all of the above decisions

In addition to any of the above, I'd be keen to learn more about what your group has been working on w.r.t. model validation, Bayesian or otherwise!

Let me know if you have any thoughts!

- Jared