
Posts by Alex Solivais

Preview
GitHub - smith-chem-wisc/MetaMorpheus: Proteomics search software with integrated calibration, PTM discovery, bottom-up, top-down and LFQ capabilities

Free copies of MetaMorpheus available for an unlimited time. Get yours now before they are all gone!

github.com/smith-chem-w...

1 year ago 25 9 2 0

Did you ever consider what effect in-source oxidation might have on the mass-shifted decoys? For z = 2, an 8*1.0005 Th mass shift is about the same as an oxidation (±10-20 ppm).
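A quick back-of-the-envelope check of that arithmetic (my own sketch, not from the thread; the oxidation mass is the standard monoisotopic value):

```python
# Compare the implied mass of an n*1.0005 Th m/z shift at a given charge
# against methionine oxidation, expressed in ppm of the peptide mass.
PEAK_SPACING = 1.0005     # Th, spacing used for the mass-shifted decoys
OXIDATION = 15.994915     # Da, monoisotopic mass of +O

def shift_vs_oxidation_ppm(n_shifts: int, charge: int, peptide_mass: float) -> float:
    """ppm difference between an n*1.0005 Th shift (i.e. n*1.0005*charge Da)
    and an oxidation, relative to the peptide's mass."""
    mass_shift = n_shifts * PEAK_SPACING * charge
    return (mass_shift - OXIDATION) / peptide_mass * 1e6

# An 8*1.0005 Th shift at z = 2 implies a 16.008 Da mass shift,
# ~0.013 Da from oxidation. In ppm terms for typical tryptic peptides:
for mass in (1000.0, 1500.0, 2000.0):
    print(f"{mass:6.0f} Da peptide: {shift_vs_oxidation_ppm(8, 2, mass):5.1f} ppm")
```

Whether that 7-13 ppm gap falls inside the matching tolerance depends on the peptide mass and the tolerance used, which is exactly the concern raised here.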

1 year ago 2 0 1 0

Very cool figure! I understand that N*1.0005 mass shifts are used due to the distribution of possible peptide masses.

1 year ago 1 0 1 0

We only need to assume that an incorrect transfer for a given donor peak is equally likely to involve the predicted RT as it is to involve the random RT.

1 year ago 2 0 0 0

That's a good point! There's a chance (I would argue a small chance) that you could randomly choose an RT and end up with an accurate match.

However, we don't rely on the assumption that every single random-RT peptide is an incorrect match.

1 year ago 2 0 2 0

The and Käll use a 5*1.0005 Th shift because they expect there to be peptides at the shifted m/z.

"The idea behind this offset is that the density, w.r.t. precursor m/z and retention time, of MS1 features is approximately the same in this offset region."

1 year ago 1 0 0 0

A match is false if the two peaks were generated by different analytes. We can't really "guarantee" that a match is false. We only assume that an incorrect match for a given donor peak is equally likely to involve the predicted RT as it is to involve the random RT.
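A toy sketch of how that symmetry assumption is typically used (my own illustration of a generic decoy-count estimator, not PIP-ECHO's actual implementation): if incorrect matches are equally likely at the predicted RT and the random RT, the number of decoy (random-RT) matches estimates the number of false target matches.

```python
# Generic decoy-based FDR estimate under the symmetry assumption:
# false target matches ~= decoy matches, so FDR ~= decoys / targets.

def estimated_fdr(n_target_matches: int, n_decoy_matches: int) -> float:
    """Estimate the false discovery rate among accepted target transfers."""
    if n_target_matches == 0:
        return 0.0
    return min(1.0, n_decoy_matches / n_target_matches)

# e.g. 1000 transfers found at predicted RTs, 12 matches found at random RTs:
print(f"estimated FDR: {estimated_fdr(1000, 12):.1%}")  # estimated FDR: 1.2%
```

Note this only requires the *rates* of incorrect matching to be symmetric; it does not require every individual random-RT match to be incorrect.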

1 year ago 1 0 0 0

I don't think those assumptions are necessary (or true). When we investigated potential features, we found that XIC shape doesn't have a lot of predictive power. Most peptides have approximately Gaussian peaks. Likewise, the isotopic distributions for peptides with similar masses don't really differ.

1 year ago 2 0 1 0

"In this case 11 Da was chosen as a randomly selected integer value which differs from any known common post-translational modification. Indeed the number of matches does not vary significantly as long as the mass shift value stays an integer" - Petyuk et al., 2007

1 year ago 2 0 0 0

In the original work from PNNL, an 11 Da shift was used as this mass difference doesn't correspond to any common PTMs or tags. However, if you look at the human proteome, there are a lot of peptides that are 11 Da away from one another.
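The check described here is easy to sketch (toy masses of my own invention, not actual human-proteome data): count peptide pairs whose monoisotopic masses differ by roughly 11 Da within some tolerance.

```python
# Count unordered peptide-mass pairs separated by ~11 Da.
# Pairs like these would let a shifted "decoy" peptide collide with a real one.
from itertools import combinations

def pairs_near_delta(masses, delta=11.0, tol_da=0.01):
    """Count unordered mass pairs whose difference is within tol_da of delta."""
    return sum(
        1 for a, b in combinations(sorted(masses), 2)
        if abs((b - a) - delta) <= tol_da
    )

# Toy example: two pairs sit ~11 Da apart, one pair misses the tolerance.
masses = [800.40, 811.405, 905.12, 916.118, 1502.77, 1513.80]
print(pairs_near_delta(masses))  # prints 2
```

Run against a real digested proteome (and a ppm rather than fixed-Da tolerance), this is the kind of tally that shows how many 11 Da coincidences actually exist.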

1 year ago 1 0 3 0

We would love to look at more single-cell datasets, but our method for FDP estimation only works with specialized "two-proteome" experiments. It would be great if our method catches on and more of these datasets are generated!

1 year ago 2 0 0 0

The randomized RT we use when trying to locate a decoy peak has to be different from the RT of the target donor peptide. This ensures we don't select the target peak twice. Then, at the end of the peak-matching procedure, we go through all peaks and make sure none were assigned to multiple peptides.
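Both safeguards are simple to sketch (my own illustration of the two checks described above, not the paper's actual code; the exclusion window and run length are made-up parameters):

```python
# Two safeguards: (1) draw the decoy RT away from the donor peptide's RT so the
# target peak can't be re-selected; (2) verify no peak is claimed twice.
import random
from collections import Counter

def random_decoy_rt(target_rt: float, run_length: float,
                    exclusion: float = 1.0, rng=random) -> float:
    """Sample an RT uniformly over the run, rejecting draws within
    `exclusion` minutes of the target RT."""
    while True:
        rt = rng.uniform(0.0, run_length)
        if abs(rt - target_rt) > exclusion:
            return rt

def assigned_once(assignments: dict) -> bool:
    """True if no chromatographic peak was matched to multiple peptides.
    `assignments` maps peptide -> peak id."""
    return all(n == 1 for n in Counter(assignments.values()).values())

rt = random_decoy_rt(target_rt=42.0, run_length=120.0)
assert abs(rt - 42.0) > 1.0   # decoy RT never lands on the target

print(assigned_once({"PEPTIDEA": "peak_1", "PEPTIDEB": "peak_2"}))  # True
print(assigned_once({"PEPTIDEA": "peak_1", "PEPTIDEB": "peak_1"}))  # False
```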

1 year ago 3 0 0 0
Preview
Improved detection of differentially abundant proteins through FDR-control of peptide-identity-propagation

Quantitative analysis of proteomics data frequently employs peptide-identity-propagation (PIP) — also known as match-between-runs (MBR) — to increase the number of peptides quantified in a given LC-MS/MS experiment. PIP can routinely account for up to 40% of all quantitative results, with that proportion rising as high as 75% in single-cell proteomics. Therefore, a significant concern for any PIP method is the possibility of false discoveries: errors that result in peptides being quantified incorrectly. Although several tools for label-free quantification (LFQ) claim to control the false discovery rate (FDR) of PIP, these claims cannot be validated as there is currently no accepted method to assess the accuracy of the stated FDR.

We present a method for FDR control of PIP, called "PIP-ECHO" (PIP Error Control via Hybrid cOmpetition) and devise a rigorous protocol for evaluating FDR control of any PIP method. Using three different datasets, we evaluate PIP-ECHO alongside the PIP procedures implemented by FlashLFQ, IonQuant, and MaxQuant. These analyses show that PIP-ECHO can accurately control the FDR of PIP at 1% across multiple datasets. Only PIP-ECHO was able to control the FDR in data with injected sample size equivalent to a single-cell dataset. The three other methods fail to control the FDR at 1%, yielding false discovery proportions ranging from 2–6%.

We demonstrate the practical implications of this work by performing differential expression analyses on spike-in datasets, where different known amounts of yeast or E. coli peptides are added to a constant background of HeLa cell lysate peptides. In this setting, PIP-ECHO increases both the accuracy and sensitivity of differential expression analysis: our implementation of PIP-ECHO within FlashLFQ enables the detection of 53% more differentially abundant proteins than MaxQuant and 146% more than IonQuant in the spike-in dataset.

Competing Interest Statement: The authors have declared no competing interest.

Re-posting our new preprint on match between runs. This multi-lab effort (Keich, Noble, Payne & Smith) led by Alex Solivais should be of interest to anyone doing LFQ. We describe here how to control FDR in LFQ and provide the open source software to do it.
www.biorxiv.org/content/10.1...

1 year ago 31 15 2 2

It's easier to find decoys, because RT is a more continuous variable. We consider one continuous RT range, instead of looking at discrete points in m/z space and iterating until we find something.

1 year ago 3 0 1 0