
Posts by Michael Tress

The problem with that is that Spain has some of the most rancid opposition parties in the world and if (as seems likely) they get in at the next election, it's going to be pretty difficult to move to Spain.

1 month ago 0 0 0 0
‘No to war’: Sánchez doubles down after Trump threat to cut off trade with Spain PM says his country will not be complicit in growing conflict in Middle East ‘simply out of fear of reprisals from someone’

"No to war." - Spain's PM Pedro Sánchez

“You can’t respond to one illegality with another because that’s how humanity’s great disasters begin.”

www.theguardian.com/world/2026/m...

1 month ago 102 22 3 1

Spanish PM Pedro Sánchez on the Iran War.

"Spain's position at this juncture is clear and forceful. It is the same position we have maintained in Ukraine and also in Gaza.

No to the breaking of international law that protects us all, especially the most defenceless - the civilian population."

1 month ago 905 261 26 25

Yup, and CD44 is not alone in that respect. I keep coming across new cases all the time.

1 month ago 1 1 0 0

There are nine inserted exons in the MANE Select/UniProtKB variants. It does fit with the GTEx data that there is some tissue specific use of the 9 exons, particularly in skin/mucous tissues. And maybe there are different combinations of exons, but that's hard to say for sure with RNASeq.

2 months ago 0 0 1 0

To be fair, I think the long UniProt/MANE Select isoform does exist. It has multiple exons, sporadic protein/transcript support and the frame of the exons is preserved across mammals. Most definitely a minor isoform though.

Number 12 is also the @appris.bsky.social principal isoform.

2 months ago 0 0 1 0
The likely pathogenic variants annotated in TMCO1 and EDNRB. (A) The 3D structure of the principal isoform of TMCO1 predicted by AlphaFold (Abramson et al. 2024). Hydrophobic trans-membrane regions are coloured in yellow, the hydrophobic region of the predicted signal anchor (uncleaved signal peptide) in orange. (B) The AlphaFold 3D structure of EDNRB. Hydrophobic trans-membrane regions are coloured in yellow, the hydrophobic region of the predicted signal peptide in orange. (C) A screenshot of the 5′ exon of TMCO1 in the UCSC genome browser (Raney et al. 2024) showing the amino acid sequence, with methionines (from ATGs) in green. The methionine on the far right indicates the position of the upstream start codon and the central methionine the position of the main start codon. Pathogenic and likely pathogenic variants are shown as red dots below the sequence. The likely pathogenic frame shift variant that affects the upstream region of TMCO1 is in the centre of the image.

Finally, this has implications for gene annotation. There is ever more evidence for these isoforms and some have already become the main isoform.

If this happens, the N-terminal extension may get tagged with spurious pathogenic mutations (as in PUS1, TMCO1 and EDNRB) that will be hard to correct.

4 months ago 0 0 0 1

If these upstream regions are translated and their proteins degraded, it would be the definition of noisy translation.

It suggests that other N-terminally extended protein isoforms are also likely to be the result of noisy translation, and that they escape degradation only because they are not deleterious.

4 months ago 0 0 1 0
Protein targeting and degradation are coupled for elimination of mislocalized proteins - Nature Membrane proteins that fail to be delivered to the endoplasmic reticulum must be rapidly degraded to avoid inappropriate aggregation and disruption of protein homeostasis. The mechanism of this proces...

There are well-documented pathways for this cytoplasmic degradation of proteins with exposed hydrophobic regions that revolve around BAG6.
www.nature.com/articles/nat...

4 months ago 0 0 1 0

Alternative isoforms with blocked signal peptides are produced and deposited in the cytoplasm, just like other N-terminally extended isoforms.

The reason that only isoforms with blocked signal peptides have no proteomics support is likely to be because they are degraded post-translation.

4 months ago 1 0 1 0
Thousands of human non-AUG extended proteoforms lack evidence of evolutionary selection among mammals - Nature Communications Analysis of a large number of Ribo-seq datasets and genomic alignments led to detection of novel non-AUG proteoforms. Unexpectedly the number of non-AUG proteoforms identified with Ribo-seq greatly ex...

There are also few or no limits on the generation of N-terminally extended protein isoforms with blocked signal peptides in translation. 13.7% and 11.2% of upstream translated regions detected in two ribosome profiling experiments were in genes with signal peptides.
www.nature.com/articles/s41...

4 months ago 0 0 1 0

... OK, I plucked that 17% out of thin air.

With @appris.bsky.social principal isoforms as the main representative of each gene, 14.7% of Ensembl reference genes have signal peptides, as do 13.8% of the 14,888 genes we had detected in our proteomics experiments.

Exported proteins are in the cell.

4 months ago 0 1 1 0
Evidence for widespread translation of 5′ untranslated regions Abstract. Ribosome profiling experiments support the translation of a range of novel human open reading frames. By contrast, most peptides from large-scale

However, out of the 170 N-terminal extensions that we detected peptides for in our paper, C1QL4 was the only one that had a blocked signal peptide.

Approximately 17% of proteins have predicted signal peptides, so where did the rest go?

More after the break.
academic.oup.com/nar/article/...

4 months ago 0 0 1 0

We did find peptides for the N-terminally extended C1QL4 protein, and the cross species conservation suggests that the potentially exposed signal peptide is no longer a problem in this family of proteins.

4 months ago 0 0 1 0
Alignment of C1Q-like proteins from human, fish and lamprey. The position of the conserved ATT start codon is marked as an isoleucine in orange, a conserved basic motif in dark red, and a conserved alanine, valine and glycine-rich region in green.


Curiously, the upstream extensions in the C1QL family would block a signal peptide, so the proteins would not be exported to the ER.

This is important because (a) the C1QL proteins would be trapped in the wrong cellular compartment and (b) the hydrophobic signal peptide could lead to aggregation.

4 months ago 1 0 1 0
(A) The translated upstream region in CCDC8. The orthologous sequences are from eutherian mammals. The alignment and colouring are adapted from the CodAlignView server and based on the Cactus 241-way mammalian alignments. Synonymous base changes are shown with a light blue background, non-synonymous changes that would result in conservative amino acid substitutions are shown with a dark blue background, and non-synonymous changes that would produce non-conservative substitutions are shown with a yellow background. The annotated downstream ATG is shown with a green background. The detected peptide is shown above the alignment in red font. The start codon is highlighted with a purple box. (B) The AlphaFold (59) model for coiled coil domain containing 8 from Iberian lynx downloaded from UniProt (A0A485NL47) with the novel human N-terminal sequence painted onto the structure. The novel region coded by the translated upstream region (in yellow) completes a PNMA N-terminal RRM-like domain. (C) The AlphaFold model for Helicase with zinc finger 2 (from gene HELZ2) from Pallas’ mastiff bat downloaded from UniProt (A0A7J8HGE4) with the novel human N-terminal sequence painted onto the structure. The novel region coded by the translated upstream region (in yellow) completes a globular structural domain.

Some of these novel N-terminal extensions, especially those generated from non-AUG start codons, are highly conserved, quite clearly functional, and have not been annotated because of the unusual start codon.

Examples of these in our papers include CCDC8, HELZ2, VANGL2 and the whole C1QL family.

4 months ago 2 0 1 0
Evidence for widespread translation of 5′ untranslated regions Abstract. Ribosome profiling experiments support the translation of a range of novel human open reading frames. By contrast, most peptides from large-scale

A large majority of novel sequences detected in ribosome profiling experiments are translations from upstream start codons, often non-AUG start codons.

Many of these novel sequences, N-terminal extensions, uoORFs or uORFs, are also detected in proteomics experiments:
academic.oup.com/nar/article/...

4 months ago 0 0 1 0
The degradation of extended protein isoforms points to a misfiring translation initiation process

Over the weekend, Molecular Genetics and Genomics published our most recent paper.

Upstream start codons can produce alternative proteins with novel N-terminal amino acids that block signal peptides.

These proteins are translated but not found in any proteomics experiments.

rdcu.be/eUi7o

4 months ago 4 2 1 0

Finally managed to write a thread to go with our new paper ...

4 months ago 0 0 0 0

A quarter of the new coding genes that we validated were testis-expressed and the same number were retrovirus-derived and detected in placenta or stem cells. It is unclear how many novel genes remain to be detected, but our results do provide clues as to where best to look for them.

4 months ago 0 0 0 0
Workflow from the paper showing the numbers and types of regions not in GENCODE that had PeptideAtlas support.

In addition to the 35 new coding ORFs, we also found evidence for 279 alternative isoforms and 99 translated upstream regions. The vast majority of the upstream translations were validated by their peptides. Translation from upstream regions is more common than is currently thought (see paper above).

4 months ago 0 0 1 0
Numbers of genes in each of the three groups. In yellow the numbers of genes we found annotated as coding by GENCODE in v48 (G48). In reality we disagree with more than just 3 genes, since GENCODE annotated eight or nine copies of the POM121L1P repeats as coding along with the three paralogues of ENSG00000293661, at least two of which are highly likely to be pseudogenes.


We report our results to GENCODE, so 10 genes were already annotated as coding prior to the paper. However, we believe that only 7 are coding. The annotation of LINC03040 and MYH16 as coding was premature and no POM121L1P repeats should have been annotated because peptides map to at least 8 regions.

4 months ago 0 0 1 0
UniProt logo


None of these new genes are entirely novel because they all had to have been “discovered” at some point to be included in the PeptideAtlas search database. None of the 16 genes we believe are likely to be coding were annotated in RefSeq either, but 8 were included in the UniProtKB human proteome.

4 months ago 0 0 1 0
AlphaFold model of LINE1 from UniProtKB

Finally, peptides for 5 predicted proteins mapped to multiple regions in the genome. We believe that most of these peptides were also produced by aberrant translation. LINE 1 ORF1 would be a good example, present in hundreds of regions and with more than 50 peptides in cancers.

4 months ago 1 1 1 0
Evidence for widespread translation of 5′ untranslated regions Abstract. Ribosome profiling experiments support the translation of a range of novel human open reading frames. By contrast, most peptides from large-scale

This evidence for aberrant translation reinforces the view that dysregulated cells can produce non-functional proteins in sufficient quantities to be detected in MS experiments. We also found this in our previous paper on translation from upstream start codons:

academic.oup.com/nar/article/...

4 months ago 0 0 1 0

What about the other 19 potential coding genes with validated peptides? Well, 14 are most likely to be non-coding regions undergoing aberrant translation. Ten are known to be cancer relevant, and for 12 the reading frames are not preserved beyond human. All peptides are restricted to cancer or cell lines.

4 months ago 0 0 1 0
On the left, the breakdown of the origin of the 16 likely coding genes: 10 gene duplications (GD), six retroviral ORFs (TE) and none de novo. Contrast this with the other 19 regions with peptide support.

None of the 16 genes that we found peptide evidence for had evolved ab initio within primates.

4 months ago 0 0 1 0
Alignment of the three placenta-expressed ERV gag ORFs supported by PeptideAtlas peptides. Residues are coloured by structural domains from the related AlphaFold model of ERVFRD-2 from the paper.


Another 6 coding genes derived from retroviruses, and 3 of these were detected exclusively in placenta. This is remarkable because up to now all well-known co-opted retroviral genes in human placenta were derived from env ORFs. All three PeptideAtlas-supported novel ORFs were ERV gag ORFs.

4 months ago 2 1 1 0
The relative positions on chromosome X of the four ETDA paralogues (in blue) along with the four paralogues of ENSG00000293661 (yellow).


Like several other gene duplications, TrEMBL entry Q3ZM62 (now ENSG00000293661) is expressed in testis. It is a eutherian paralogue of ETDA and this pair of genes has three more copies on chromosome X in human. However, there is no evidence to suggest that any of the other six genes are coding.

4 months ago 0 0 1 0
Predicted structures from AlphaFold for genes ZNF840P and CFAP144P1 with the detected PeptideAtlas peptides mapped in yellow. On the right, an analysis of one of the CFAP144P1 peptides by Vseq.

The other 14 genes can be split into two groups. Eight derived from gene duplications. Many of these have undergone considerable changes and may have been pseudogenes prior to gaining novel function. These genes include C5orf60 (now SPATA31J1), CFAP144P1, MSL3P1 (now MSL3B) and ZNF840P.

4 months ago 1 1 1 0