Proteins related to purine biosynthetic proteins, the big picture.

I want to put all of those protein domains in the previous posts into a larger context. Some of this I mentioned in previous posts in this series but it is worth mentioning again. I post a part of the big drawing to start.

Nucleotides do a set of things. I hope I have covered them all and there aren’t any things they do that I missed.

  • Nucleotides are made, in specific biochemical pathways, and separated into purine (AG) and pyrimidine (UCdT) pathways. These pathways merge with thiamine and histidine pathways.
  • Nucleotides are genomes, messenger RNAs, ribosomal RNAs, and transfer RNAs (and other associated RNAs). They base pair through polar interactions between their carbonyls (C=O) and amines (-NH2/-N=).
  • Nucleotides are the biosynthetic source for other cofactors like vitamins. The output of purine biosynthesis becomes folate, flavin…
  • Nucleotides are dispensers for phosphate, and not just ATP. There are thematic differences in the kinds of things ATP, GTP, and CTP dispense phosphate for.
  • Nucleotides are handles and escort molecules for cofactors including many vitamins, membrane components and some other molecules. And handles for amino acids during translation in tRNA.
  • Nucleotides along with amino acids are the substance of transcription and translation. The base pairing becomes patterns of amino acid sequence that determine protein sequence.
  • Nucleotides are signalling molecules, often when cyclized, alone and in multiples.
  • Nucleotides act as replaceable groups like in sulfur transfer and Adenylylation.

Hydrothermal systems, minerals and ocean chemistry seem to somehow expand into this.

Nucleotide biosynthesis.

First there is the variable of the biochemical pathways themselves. What was the early extent of vent chemistry generated nucleotides? Did the reactions race straight to inosine or AMP? Or did things accumulate at points like the first ring closure? Or glycine binding? Did glycine binding repeat in an early polypeptide? Maybe there was a population of pyrrole (5-membered carbon and nitrogen ring) creatures that could catalyze things towards inosine?

Looking at part of the big drawing again the similarities between thiamine thiazole biosynthesis and stage 1 of purine biosynthesis are interesting. A 5-carbon chain is made from a 2C and a 3C (DXP), and it is combined with a modified glycine and sulfur. And right after purine first ring closure that molecule can be made into the second part of thiamine, AIR becomes HMP. There’s even an imine on glycine in both cases (C=N). There’s this interesting imine on aspartate in niacin biosynthesis too.

And where do the pyrimidines fit in with their ring synthesis and then attachment to ribose? Was there a point where pyrroles and pyridines (5- and 6-membered carbon and nitrogen ring) coexisted on ribose-phosphate chains? How does niacin fit in with ring creation and then ribose binding, and imine on aspartate?

And how do I think about the way the purines and pyrimidines have opposite relationships with ribose? Purines being built onto ribose and pyrimidines built and then put onto ribose. And niacin, NAD(P), built very much like pyrimidines, does that matter?

Genomes, RNA, and base pairing.

This involves interactions between strands of the same or different nucleotides. In DNA genomes and RNAs these are attractions based on polarity, but what about other interactions as one goes back? Can we count on today’s sequences being AGCU? (I believe it was once just RNA genomes) Maybe they blur down into base pairing between other 5 and 6 membered rings? Intermediates in nucleotide biosynthesis?

Was chemistry between strands a thing? Are genomes, transcription and translation just massive off-loading of functions from an ancestral molecule that did more than sit there as an information source analog? Did it make itself through chemistry it participated in as something that looked like a purine intermediate like AIR?

Nucleotides as biosynthetic sources of amino acids and cofactors.

Nucleotides are the biosynthetic source of the amino acid histidine, and cofactors like folate, riboflavin, and molybdenum cofactor. GTP is specifically what is dismantled for the cofactors, and ATP for histidine.

Maybe there are clues for the origin of ribose in how it is deconstructed in these things and tryptophan? PRPP is the ribose source in tryptophan, not nucleotides, but that’s part of the story too. After completing this more complicated drawing involving nucleotide metabolism I think about being able to peel back later evolving parts of metabolism and trying to find connections to more ancient parts.

Nucleotides as phosphate dispensers.

ATP is the stereotype for dispensing phosphate for active site chemistry and molecular movement. But GTP does it too and specifically for translation and cytoskeleton proteins like microfilaments and microtubules, things that could be useful for getting around in cells or mineral systems. CTP sometimes drives cysteine attachment to make coenzyme-A and it plays a phosphate dispenser role in cell division (ParB, chromosomal partitioning), hydrolysis indicating a completed step. Otherwise all of the NTPs and dNTPs can donate phosphate to one another in interconversions.

Nucleotides as handles.

ATP acts as a handle for many cofactors like niacin, riboflavin, coenzyme-A, S-adenosyl-methionine, and thiamine. GTP and CTP act as handles for parts of molybdenum cofactor. All of the bases act as handles for kinds of carbohydrates. UDP-glucose has the feel of a most ancient part, and CDP has a theme involving membrane components.

It’s interesting that deoxythymidine handles deoxycarbohydrates. A theme carried over. It makes me wonder about themes with ADP, UDP, and GDP as carbohydrate handles.

Nucleotides as the substance of transcription and translation.

This is what biology class covers. The actualized genetic code. The process where base pairing in DNA and/or RNA becomes specific amino acids joined in a polypeptide chain. Like the protein subdomains I listed for every protein on the big drawing. tRNA is another way that nucleotides are handles, but for amino acids.

Maybe it all goes to polyglycine originally. Glycine is involved in making purines, the thiamine 5-membered thiazole ring, and every other amino acid looks like a modified glycine. Which leads to another question, could other amino acids have been made right on early polyglycine? Or were they made separately like today and added to a polypeptide with glycine?

Nucleotides as signalling molecules.

Nucleotides are part of rapid intracellular communication networks as cyclized forms. Single nucleotides can be cyclized at the phosphate, or cyclic dinucleotides connected at their phosphates. And other forms with extra phosphates in different places that act as signals inside of a cell.

All that phosphate. Proteins detect specific phosphates and numbers of phosphates in specific locations in so many ways I have no problem with the idea that early biology emphasized it in protein domains. Maybe phosphate was in the very first parts of selection.

Nucleotides as replaceable chemical groups.

Here is the molybdenum cofactor part of the big drawing. Nucleotides are part of the logic of sulfur transfer to sulfur carrier proteins (red arrows and 1-3), here MoaD, 2 of which transfer sulfur to the molecule by replacing AMP (an adenylylation) added by MoaB. To a double glycine residue which is very interesting since I think things go to polyglycine somewhere in history. This also happens with ThiS in thiamine biosynthesis and I want to add that to the big drawing at some point.

This is an area that is largely unexplored by me, like nucleotides as signalling molecules and nucleotides as handles outside of translation. It’s to the point where I may need a separate big drawing for everything nucleotides do, in addition to the drawing that shows how they are made and what they are linked to in biosynthesis.

And that’s it for the tour of nucleotide biosynthesis, the proteins that carry it out, and their relationships with the rest of metabolism. I hope this has been interesting and others now find these origin of life puzzle pieces as interesting as I do. I may be short on answers but it’s about finding clues to how it may have happened. Being able to imagine how it could have looked given that we collectively probably won’t know for sure in our lifetimes. At least there is substance for the imagination.

General Biochemical Patterns in Purine Biosynthesis and Protein Relatives. Part 5

So this started with ribose phosphate, glutamine-derived ammonia, glycine, and formate forming a 5-membered ring. Then bicarbonate bound a nitrogen and moved to a carbon, and aspartate bound the bicarbonate and left its nitrogen behind. Another formate bound the same nitrogen that the bicarbonate had bound before it moved, and closed a 6-membered ring by binding to the nitrogen that aspartate left behind. We have a complete purine, IMP.

Red arrows show where the purines join the thiamine and histidine pathways.

But IMP is not in DNA or RNA unless made by editing enzymes so this isn’t done. This last stage of purine biosynthesis, stage 5, is 2 parallel sets of 2 reactions that give AMP and GMP. These are equivalent with respect to the origin of life no matter how much I want to make assumptions about which came first. IMP came first if anything, but the surrounding environment may have pushed things towards AMP or GMP (or XMP, below, or things that don’t exist anymore).

I’ll start with AMP synthesis to get the hard part out of the way. The hard part is the P-loops that were mentioned in my last post as one of the oldest groups. I like to write down all of the things and sift out patterns.

Stage 5a.1: aspartate binds again.

Again aspartate is bound by its nitrogen, this time to the carbonyl carbon (C=O), and again just a phosphate is needed. More reasons to think about an aspartate accumulation step in molecular evolution. Evolution made things fun and complicated by having a GTP used for this reaction.

The protein that does this is in the A+B3L architecture bin, the P-loop domains-like X bin, and P-loop domains-related homology bin. Finally in the P-loop containing nucleoside triphosphate hydrolases topology bin we find PurA as adenylosuccinate synthetase.

When I click into that topology bin I see why this is the hard one. There are many many P-loop nucleoside triphosphate hydrolases (things that separate parts of nucleotides with water, the phosphates here). It’s worse than the Rossmann domains. And guess what, they might be related.

On the emergence of P-Loop NTPase and Rossmann enzymes from a Beta-Alpha-Beta ancestral fragment. Longo 2020

And that’s what happens when something is an old protein domain, there are lots of kinds in lots of proteins. The following paper is suggestive about the oldest purpose of the P-loops, phosphate control. They looked at P-loops from many proteins and created a minimal fragment that maintains its activity.

They got it down to 8 amino acid residues.

The graphs are tests of activity of the fragment and mutants that demonstrate lack of activity.

Simple yet functional phosphate-loop proteins. Romero Romero 2018

Another paper by the group above, looking at Rossmanns and P-loops, examined the most ancient domains and found phosphate interaction to be the most basic theme.

Short and simple sequences favored the emergence of N-helix phospho-ligand binding sites in the first enzymes. Longo 2020

There’s little more I can do here, phosphate is so basic and widespread in biology that I’ve little trouble believing every use of phosphate can be in these proteins, active site chemistry, mechanical changes to a protein, changing charge, creating or blocking a binding site…

Still some things occur so often on ECOD that they have repeat family bins with increasing numbers and that is worth mentioning. Sulfotransferases show up repeatedly. So do helicases. Bits of molecular motors dynamin, kinesin, myosin… Iron transport and Iron-sulfur cluster binding proteins.

Polyphosphate kinase 2, Ppk2 seems significant since a store of phosphate is needed for all of this. And there are ATP synthase components.

Formyltetrahydrofolate synthase, FTHFS, matters for making RNA since it creates the 10-formyl tetrahydrofolate used to make purines.

It took to almost the end of purine biosynthesis for something truly ancient to show up, and that’s ok because all the rest of purine biosynthesis just needed a single phosphate to work. I can imagine primordial phosphate dispensers accumulating and releasing phosphate in ways where that phosphate can do work without needing a compartment in a protein. A mineral compartment could work fine.

Stage 5a.2: aspartate leaves as fumarate. AMP is synthesized.

It’s PurB again, the same as the last post. That was quick, life was lazy and used the same protein to do it again. It’s worth noting that PurB does not remove aspartate as fumarate in arginine biosynthesis (argininosuccinate).

Stage 5b.1: adding water to IMP.

GuaB has 2 domains. The general reaction is adding water to IMP so it has 2 carbonyl groups, making XMP. Xanthosine monophosphate. What is interesting is that it uses niacin which is based on aspartate and has a biosynthesis pathway similar to pyrimidines. The ring is made and then it is put on ribose to make a proton/electron dispenser/acceptor.

TIM barrels

The first domain is in the A/B Barrels architecture bin, the TIM Alpha Beta barrel X group, and TIM Barrels for homology and topology. This is another hard group because there are so many. This is the 3rd ancient family in these posts after the Rossmanns and P-loops. A barrel of 8 repeating A/B segments. Once evolution hit on this it did a lot with it.
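Since I keep reading off these bin addresses, here is a minimal Python sketch of how two of them could be compared level by level. The bin names are the ones quoted in this post for PurA and the GuaB TIM barrel domain. The class name `EcodAddress`, the function `deepest_shared_level`, and the "hypothetical sibling" entry are my own illustrative assumptions and not anything ECOD itself provides.

```python
from dataclasses import dataclass

# The four levels I read off for each domain, from broadest to narrowest.
LEVELS = ("architecture", "x_group", "homology", "topology")


@dataclass(frozen=True)
class EcodAddress:
    """A domain's bins, written down by hand from the ECOD pages."""
    architecture: str
    x_group: str
    homology: str
    topology: str

    def path(self):
        return tuple(getattr(self, level) for level in LEVELS)


def deepest_shared_level(a, b):
    """Name of the narrowest bin two domains share, or None if none."""
    shared = None
    for level, mine, theirs in zip(LEVELS, a.path(), b.path()):
        if mine != theirs:
            break
        shared = level
    return shared


# Bin names as quoted in this post.
pur_a = EcodAddress(
    "A+B3L",
    "P-loop domains-like",
    "P-loop domains-related",
    "P-loop containing nucleoside triphosphate hydrolases",
)
gua_b_tim = EcodAddress(
    "A/B Barrels", "TIM Alpha Beta barrel", "TIM Barrels", "TIM Barrels"
)
# Hypothetical entry: same bins as PurA down to homology, made-up topology.
hypothetical_sibling = EcodAddress(
    pur_a.architecture, pur_a.x_group, pur_a.homology, "made-up topology bin"
)

if __name__ == "__main__":
    print(deepest_shared_level(pur_a, gua_b_tim))
    print(deepest_shared_level(pur_a, hypothetical_sibling))
```

The point of stopping at the first mismatch is that the bins are nested, so a shared topology name only means something if everything above it matches too.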

TIM barrels confuse me. According to one source I found up to 15 different reactions are catalyzed by this family and it’s the shape that seems conserved.

From Meroz 2007

Roles of Specific Peptides in Enzymes. Meroz 2007

Evolution is thought to emphasize the progression from a quarter to a half barrel.

Hidden Sequence Repeats: Additional Evidence for the Origin of TIM-Barrel Family. Ji 2016

So what does shape benefit here? The best I can think of is concentration of things near active sites in that central pore, or multiplication of active sites.

Skip this paragraph if you want to avoid a big list. Significant proteins include: Methylenetetrahydrofolate reductase (makes folate with a 5-methyl), RNAseP, ThiC (makes the big ring of thiamine from purine AIR), Dihydrodipicolinate synthetase (lysine biosynthesis), Quinolinate phosphoribosyl transferase (makes niacin), pyruvate kinase (last step of glycolysis, does a substrate level phosphorylation), glutamate synthase, enolase (glycolysis), Nicotinate phosphoribosyltransferase (adds ribose to the niacin ring), Fructose-bisphosphate aldolase (glycolysis), DAHP synthetase (first step of aromatic amino acid biosynthesis), Type I 3-dehydroquinase (aromatic amino acid biosynthesis), Triosephosphate isomerase (glycolysis, the “TIM” in “TIM barrel”), pterin binding (folate-like), RuBisco large chain (modern photosynthesis carbon acquisition), PdxJ (makes pyridoxal phosphate), ThiG (makes the thiamine 5-membered thiazole ring), Ribulose-phosphate 3 epimerase (makes xylulose phosphate), CO dehydrogenase/acetyl-CoA synthase delta subunit, and lots of things that bind S-adenosyl-methionine.

CBS domain

There is a second domain buried inside of the TIM barrels. This domain is in the “A+B duplicates and obligate multimers” architecture bin, and then the “CBS domain” X, homology and topology bins. According to Interpro CBS domains pair to form single globular domains (called Bateman domains) and GuaB naturally has 2. These domains seem to bind things containing adenylate (adenine) making this a potential regulatory site or a site that binds the adenine of the NAD (niacin) the protein uses to move electrons around.

Stage 5b.2: glutamine provides another ammonia.

GuaA, GMP synthase, works like the other enzymes providing ammonia via glutamine. The ammonia pops off and the phosphate from ATP is used to bind it to the carbon that binds the original carbonyl from IMP. This protein has 3 subdomains.

GATase domain.

This one has been seen before. It’s the GATase1 domain in the Flavodoxin-like bins covered previously. So I’ll refer back to that section.

The next subdomain is new. It’s in the A/B3LS architecture bin, and the “HUP-domain-like” X, homology and topology bins. HUP domains seem to have to do with hydrolysis of the alpha-beta bond of ATP which would leave AMP and diphosphate (the three phosphates of NTPs are labeled alpha, beta, and gamma counting from ribose).

Here’s another list. HUP domains are found in: adenylyl and cytidylyl transferases (bind adenosine and cytidine to things, including making niacin). NAD synthase. Glutamine, methionine, lysine, tyrosine, tryptophan, arginine, leucine, isoleucine and valine tRNA synthetases. Pantothenate synthase (coenzyme-A), FAD synthase, argininosuccinate synthase, a sacrificial sulfur transferase (LarE, the protein has to be regenerated before it works again), universal stress protein (USP), an Na/Cl/K cotransporter, asparagine synthase, ThiI (thiamine biosynthesis), electron transfer flavoprotein domain, phosphoadenosine phosphosulfate (PAPS) reductase, and ATP-sulfurylase (makes APS which becomes PAPS).

Alpha-lytic protease prodomain-like.

Alpha-lytic protease prodomain is a part of a protein that cuts proteins (protease) that has to be removed before it is active (prodomain). But beyond that GuaA might not be related to anything else in there, it’s in its own homology bin all by itself and is listed as a dimerization domain (so 2 of these proteins interact) making this part done.
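To keep the two branches of stage 5 straight, they can be written down as a small lookup table. This is only a sketch in Python: the enzymes, products, and inputs are the ones named in this post, while the table layout and the helper name `route` are my own hypothetical choices.

```python
# Stage 5 of purine biosynthesis as described in this post.
# Each step is (substrate, enzyme, product, things the step consumes).
# The layout and the helper name `route` are illustrative assumptions.
STAGE_5 = {
    "AMP": [
        ("IMP", "PurA", "adenylosuccinate", ["aspartate", "GTP"]),
        ("adenylosuccinate", "PurB", "AMP", []),  # aspartate leaves as fumarate
    ],
    "GMP": [
        ("IMP", "GuaB", "XMP", ["water", "NAD"]),
        ("XMP", "GuaA", "GMP", ["glutamine", "ATP"]),
    ],
}


def route(target):
    """Return the chain of molecules from IMP to the target nucleotide."""
    steps = STAGE_5[target]
    chain = [steps[0][0]]
    for substrate, _enzyme, product, _consumed in steps:
        # Each step has to start where the previous one ended.
        if substrate != chain[-1]:
            raise ValueError(f"broken chain at {substrate}")
        chain.append(product)
    return chain


if __name__ == "__main__":
    for target in STAGE_5:
        print(" -> ".join(route(target)))
```

Written this way the cross-over is easy to see: the AMP branch spends GTP and the GMP branch spends ATP.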

And that is it for part 5. Hopefully when I was looking in bins I didn’t ignore anything important. I have 1 more post that puts all of this in a larger context.

General Biochemical Patterns in Purine Biosynthesis and Protein Relatives. Part 4

Ammonia, glycine, and formate formed a ring on ribose-P, bicarbonate bound a nitrogen and was moved over 2 atoms. And little but phosphate was required for all these steps (externally, the protein still affects the process).

It’s worth mentioning here that this is another place where purine biosynthesis splits and joins another metabolic pathway. Previously thiamine and B12 joined at the first ring closure. Here aspartate is going to bind, through its nitrogen, to the bicarbonate that was moved in the previous step, and then most of that aspartate will be removed as fumarate, effectively making this a 2-step equivalent to glutamine as an ammonia dispenser. Aspartate as primordial ammonia dispenser?

After the removal of aspartate the product, AICAR, is also a by-product of histidine biosynthesis as the big drawing shows, or should show if I remembered to draw an arrow (above). That explains why most of the molecule made with ATP and PRPP suddenly disappeared. Histidine is considered to be a late evolving amino acid, and AICAR feeds into purine biosynthesis instead of purine biosynthesis feeding into the connected metabolic pathway. Maybe it’s as simple as early purine biosynthesis being old and late purine biosynthesis being younger? Maybe one can just read age into the pathways somewhere.

Stage 3.1: addition of aspartate.

This is a protein domain family that I’ve already covered in general, but it has a few relations worth mentioning. SAICAR synthase, PurC, is related to the ATPgrasp domain, and a bunch of serine/threonine/tyrosine protein kinases. And phosphate is used in attaching the nitrogen of aspartate to the bicarbonate moved in the previous step by PurE1 (the phosphate attaches to CAIR by the bicarbonate first, and is replaced by aspartate).

While the ATPgrasp and protein kinases are topology siblings, PurC has some family siblings and they are all inositol kinases.

Inositol. Every hydroxyl (-OH) can be phosphorylated.

Phosphatidylinositol 5-phosphate 4-kinase type-2 alpha, Inositol-trisphosphate 3-kinase A, and Inositol-pentakisphosphate 2-kinase. This one may not be that origins related as inositol is not used by many prokaryotes. It’s often a membrane lipid modification, and the various forms of phosphorylated inositol can act as signalling molecules on and off the lipids. Since inositol can have 6 phosphates it has been used as phosphate storage in plants. I tend to think inositol isn’t something I should pursue with respect to origins but that could change.

Otherwise what is interesting is that with the arrival of aspartate we have all of the basic ingredients for pyrimidines, bicarbonate and aspartate (and ammonia and phosphate).

The reactions from PurK to PurC are a place to root thinking about the appearance of a second family of nucleotides, and the identity of any early purines with a single ring with multiple extensions as the pathway progresses. Did NCAIR and CAIR ever extend as nucleotides themselves? AIR? Was a SAICAR ever part of a chain of nucleotides?

And all that has been needed chemically for purines was a phosphate. And phosphate control evolved early. Earlier than purine biosynthesis directly. The following paper looked at kinds of metabolism and parts of purine metabolism relating to phosphate control are among the oldest parts.

Structural Phylogenomics Reveals Gradual Evolutionary Replacement of Abiotic Chemistries by Protein Enzymes in Purine Metabolism. Caetano-Anollés 2013

The following refers to Figure 1 in the results, the image is too large for WordPress to let me upload.

The most ancient enzymes (colored in deep red) are located in a horizontal transect of the subnetwork that is responsible for nucleotide interconversion, supporting previous evidence that metabolism originated in enzymes harboring the P-loop hydrolase fold and responsible for these kinase functions

Stage 3.2: aspartate leaves as fumarate, leaves its nitrogen.

No phosphate is needed for PurB to do its work. Stage 3 is a 2-step version of what the glutamine ammonia dispensers did previously. This way of delivering nitrogen occurs in arginine biosynthesis too. I can’t help but think of an aspartate accumulation step in history, between that and all of the other things on the big drawing that use or start with aspartate.

And what is fumarate? Fumarate is an intermediate in the TriCarboxylic Acid (TCA) cycle, or “citric acid cycle” or “Krebs cycle”.

The Tricarboxylic acid cycle. I have added some cofactors and molecules that are gained or lost in the reactions.
The pyruvate and 2-oxoglutarate dehydrogenase complexes are multi-protein macromolecular complexes where carbon skeletons are passed from thiamine to lipoamide to coenzyme-A. I think part of our history involves being macromolecular complexes in mineral systems.

This cycle is where the precursor to glutamate comes from (2-oxoglutarate), and the precursor to aspartate is involved (oxaloacetate). Add glycine at succinyl-CoA and you get the pathway leading to heme and chlorophyll (not shown). If I had to stereotype the TCA cycle it would be about converting a 4C (oxaloacetate) and a 2C (acetyl-CoA) into a 5C (2-oxoglutarate). It also produces protons via NAD(P) (can be used to regenerate ATP), and GTP/ATP from phosphate and GDP/ADP. So the removal of fumarate may tie purines to the TCA cycle. This cycle has a reverse version like glycolysis and gluconeogenesis, there are separate enzymes that do irreversible steps (arrows that just go in one direction).
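The 4C plus 2C into 5C stereotype can be checked with some simple carbon bookkeeping. The sketch below is Python and makes a few assumptions I should flag: citrate and the two CO2 losses are not named in the text above and come from standard biochemistry, several intermediate steps are collapsed into one, and the names `CARBONS`, `STEPS`, and `is_balanced` are just made up for the illustration.

```python
# Carbon counts per molecule. Citrate and CO2 are not named in the post;
# they are added from standard biochemistry so the books balance.
CARBONS = {
    "oxaloacetate": 4,
    "acetyl": 2,        # the 2C group carried by acetyl-CoA
    "citrate": 6,
    "2-oxoglutarate": 5,
    "succinyl": 4,      # the 4C group carried by succinyl-CoA
    "fumarate": 4,
    "CO2": 1,
}

# Collapsed steps around the cycle (several intermediates are skipped).
STEPS = [
    (["oxaloacetate", "acetyl"], ["citrate"]),
    (["citrate"], ["2-oxoglutarate", "CO2"]),
    (["2-oxoglutarate"], ["succinyl", "CO2"]),
    (["succinyl"], ["fumarate"]),
    (["fumarate"], ["oxaloacetate"]),
]


def carbons(names):
    """Total number of carbons in a list of molecules."""
    return sum(CARBONS[name] for name in names)


def is_balanced(step):
    """True when a step has the same number of carbons going in and out."""
    inputs, outputs = step
    return carbons(inputs) == carbons(outputs)


if __name__ == "__main__":
    for step in STEPS:
        inputs, outputs = step
        print(" + ".join(inputs), "->", " + ".join(outputs), is_balanced(step))
```

So the 5C only appears after one carbon has left as CO2, and fumarate sits on the 4C side of the cycle that leads back to oxaloacetate.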

Fumarate also works as an electron acceptor in electron transport chains.

As for the protein domains making up PurB, the situation is reminiscent of PurM. PurB is divided into 3 subdomains and most of the same proteins are in all 3. But the 3 subdomains may be more closely related to each other with PurB, they are all in the same architecture bin where the N and C ends of PurM are in different architecture bins.

(As I type this ECOD seems to have a problem where it somehow has Alpha Arrays architecture in the Beta Barrels bin. When I first did this all 3 domains were in Alpha Arrays, now the C-terminal region is in Beta Barrels. There are no barrels made from Beta Sheets in this domain so I will keep it as Alpha Arrays (arrays of Alpha Helices). There are a few mistakes on the big drawing due to this issue which I will correct. Specifically some Beta Barrels are actually Alpha Arrays. Otherwise the relatives of this domain all seem there.)

PurB is in a group called “L-Aspartase-like” and there are “L-Aspartase-like-N”, “L-Aspartase-like-M” (middle), and “L-Aspartase-like-C” X bins. In these bins are aspartase (useful under nitrogen starvation), histidine ammonia lyase (histidine breakdown), phenylalanine ammonia lyase (plants), tyrosine and phenylalanine amino mutases (makes amino acids of opposite chirality), adenylosuccinate lyase (PurB), delta crystallin (eye proteins), and fumarate hydratase (TCA cycle protein). The exceptions seem to be 3-carboxy-cis,cis-muconate cycloisomerase (benzoate degradation), which is absent from the N-terminal end, and the histidine, phenylalanine, and tyrosine proteins, which seem absent at the C-terminal end, but I need to look at things with another archive to be sure.

Stage 4.1: formylation of the aspartate derived nitrogen.

This stage and the following step are a place where life has more than one path. In some organisms the bifunctional PurHJ (sometimes just called PurH) carries out the steps sequentially. In other organisms PurP and PurO carry out the steps. The following paper covers how widely these proteins are distributed among prokaryotes, and the authors named the PurH associated formylation protein PurV and the following protein PurJ when the functions are in separate proteins.

Different Ways of Doing the Same: Variations in the Two Last Steps of the Purine Biosynthetic Pathway in Prokaryotes Cruz 2019

PurJ is in the A+B3LS architecture bin, and the MGS-like X, homology, and topology bins (methylglyoxal synthase) so these are all related. PurJ is listed in ECOD as bifunctional protein PurH. CarB of pyrimidine biosynthesis is in here too. Lastly there is methylglyoxal synthase which provides an alternate way to metabolize 3C-phosphates when phosphate is limiting. Notably PurV activity does not require phosphate but does use folate like PurN.

PurP is familiar because it is an ATPgrasp enzyme. It also has the preATPgrasp domain, but not the C-terminal CODH-MO-N-like domain that I find so interesting. So similarly to Stage 1 life has ways of formylating with and without phosphate. But in this case PurP seems to be limited to archaea. The above paper found no bacteria using PurP. Most bacteria use PurH (the whole thing) and some have separate PurJ and PurV/PurO. Some bacteria and archaea have no identified proteins for Stage 4, autocatalytic?

Stage 4.2: closure of the second ring and the formation of IMP.

PurV is in the A+B3L architecture bin (not a sandwich), and cytidine deaminase-like X, homology and topology bins. There are a lot of proteins in here and modification of poly RNA or DNA seems to be a theme, editing enzymes. There is a whole level where RNA can be edited after transcription to change protein sequence or make other function changes. I’m amused by the tRNA adenosine deaminase TadA (TA-DA!). There are other adenine deaminases. There are cytosine deaminases, and guanosine deaminases. There is a pre-mRNA splicing factor, Prp8. RadC which is thought to be a nuclease (cuts polynucleotides).

Isopeptidase activity is associated with this group. An isopeptidase is a protein that cuts a protein outside of the main protein sequence, the relevant example here being ubiquitin-like proteins that are attached to a lysine residue (or attached to another side chain in other isopeptidases). This system, where ubiquitin tags proteins destined to be broken down in the proteasome (the cellular trash can), is eukaryotic but is related to systems in bacteria and archaea. Prokaryotic relatives of the ubiquitin system are something I will need to read more about.

LpxL is in here and that is a UDP-2,3-diacylglucosamine pyrophosphatase (removes diphosphate at some point). That is interesting due to sugars and lipid related things bound to nucleotide handles in the previous post, but is a minor bit of chemistry in this group.

FdhD is interesting. That is a sulfur carrier required for the function of formate dehydrogenase. This isn’t the first sulfur carrier/ubiquitin-like system connection I have seen. ThiS and MoaD are ubiquitin-like. Maybe there is a sulfur carrier link to isopeptidase-like (deubiquitinase-like) proteins as well?

PurO

PurO is in a group I have already covered, the A+B4L architecture bin, the NTN/PP2C X bin, and NTN (N-Terminal amiNohydrolase) homology bin. It’s a bit like coming full circle at Inosine MonoPhosphate (IMP) since PurF was at the beginning and contains one of these domains. PurO is in the proteasome subunits family bin. This is interesting because while PurO joins parts to make a ring, proteasome subunits cut proteins.

Now with second ring closure we have something that gets a one-letter abbreviation like A, G, C, and T: I, for IMP. I is never found in transcription directly, but some of those editing enzymes can turn something into I. That affects codon base pairing in translation leading to different effects. tRNAs can contain I.

Nucleotides as intracellular signalling molecules.

Inosine is also an example of a whole level I haven’t read about and integrated into my thinking yet, it is an intracellular signalling molecule like cyclic AMP or GMP derivatives. AICAR is a signalling molecule too. Guanosine tetraphosphate is a signalling molecule.

And these are all signalling molecules (“alarmone” is used with many of these) in addition to AICAR (called ZMP and ZTP as signalling molecules) and IMP. ZTP and IMP act in regulation of steps of purine biosynthesis. I found a paper looking for these systems in some archaea with a big table.

Putative Nucleotide-Based Second Messengers in the Archaeal Model Organisms Haloferax volcanii and Sulfolobus acidocaldarius. Braun 2021

Many are thought to be ancient.

Alarmones as Vestiges of a Bygone RNA World. Hernández-Morales 2019

Cyclic AMP (cAMP) shifts metabolism in response to things. In bacteria it is high when things other than glucose are carbon sources. Glucose is like an ancestral default sugar maybe. cAMP has many other functions like osmotic pressure and DNA damage. The rest of them are hard to find stereotyped roles for. It’s the cutting edge of this research.

Cyclic GMP (cGMP) is involved in UV stress adaptation and cyst (suspended animation) formation. There is a cIMP, cCMP and cUMP too but very little is known about their role.

c-di-AMP is also involved in osmotic pressure. c-di-GMP is involved in transition from motile to sessile living. cGAMP is only found in metazoa, so no origins role. But I drew a 2,3 cGAMP and there is a 3,3 cGAMP that is involved in viral infection.

Ap(n)A has different forms and Ap4A is best known. Ap4A is an alarmone for many stress conditions. I couldn’t find anything on Gp(n)G. (p)ppGpp is also involved in stress signalling.

That’s it for stages 3, 4 and an area of nucleotide biochemistry I still need to look at.

General Biochemical Patterns in Purine Biosynthesis and Protein Relatives. Part 3

Stage 1, formation of the first purine ring (a 5-membered C and N ring is a pyrrole) is not only complete, I have covered a bunch of protein families I can refer back to.

I did not expand on the relatives of one family that I will expand on in this post though. The preATPgrasp domain relatives are a group I wasn’t sure how I wanted to discuss yet.

Metabolism broadly.

This is also a good point to mention that purine biosynthesis merges with 2 other metabolic pathways here. AIR can be made into the 6-membered ring (a pyrimidine) of thiamine, HMP. What I did not put on the big drawing this time was that AIR can also be made into the “lower ligand” of vitamin B12. A ligand is an atom or molecule that binds to a molecule or protein to help carry out its function (the cyanide groups in the last post were ligands). B12 itself is a chelator of cobalt. But there are other lower ligands (potentially less origins related) and I was running out of space.

Stage 2: binding and movement of bicarbonate.

Life has 2 ways of doing the first part of stage 2 but I used the more complicated route because the direct binding of carbon dioxide to where bicarbonate moves to (by class 2 PurEs) is only in animals.

Stage 2.1: bicarbonate binding.

PurK is the protein that binds a bicarbonate to the nitrogen sticking out from the ring. And we have already seen these protein subdomains. It’s preATPgrasp, ATPgrasp, and Carbon monoxide dehydrogenase molybdoprotein N-like again. So for most of what is interesting I will refer to a previous post.

As an observation I would point out that all that is really needed to make these reactions work so far is phosphate. Every reaction but one so far uses a phosphate in active site chemistry. The exception is the PurN/PurT step, where life can add formate without phosphate (PurN), and yet PurT uses phosphate anyway in other organisms (E. coli has both).

What I will discuss this time are the relatives of the preATPgrasp domain. This fragment and its family members are in their own topology bin and there are 13 other topology siblings within a larger Rossmann-related homology group, inside of a Rossmann-like bin, inside of the A/B 3 layer sandwiches.

There are a lot of Rossmann-related domains. If I had to stereotype the group based on the associations I would label them “nucleotide binders with a bias for Adenine containing cofactors”. But that is not based on structure, just looking in the bins, and they don’t exclusively bind nucleotides.

The first 4 topology groups to mention are: a NAD(P) (niacin) binding Rossmann-fold domain, a FAD (riboflavin)/NAD(P) binding domain, a DHS-like NAD/FAD binding domain, and a “Nucleotide binding domain” bin that seems to do all of the above. These are all electron/proton dispensers/acceptors, and the first 2 contain many families, too many to go over in detail. Glycosyltransferase Mag is flagella related so I left it out.

But there is a whole area of biochemistry related to “nucleotides as handles” that I saw inside of the above bins that I haven’t covered on the big drawing. You see Adenosine (adenine-ribose-phosphate) used as a handle for cofactors all over the big drawing, and some of the other bases are used as handles in the molybdenum cofactor part of the drawing. But all of the bases have jobs as handles and there is a range of things related to membranes and sugars that adenosine, guanosine, uridine, cytidine, and thymidine are involved in as handles.

The 5th topology bin is S-Adenosyl-Methionine-dependent (SAM) methyltransferases. This also contains many families and the adenine-ribose-phosphate would be a common denominator for the first 5 bins so far. SAM is not just for methyl transfer though, the adenine-ribose can be an “adenylyl” modification (part of the molybdenum cofactor biosynthesis, sulfur carrier step). Most of methionine can be bound to something as homoserine leaving methyl-sulfur-ribose-adenine as well. As far as I can see this domain is all about SAM methyl transfer.

Tubulin nucleotide binding domain-like is very interesting. The cytoskeleton uses GTP hydrolysis on a timer to determine if subunits like tubulin are monomers or polymers. And the resulting microtubules are an intracellular highway for cell contents.

The NagB/RpiA/CoA transferase-like domain has a lot of interesting things. Some transcription repressors, 5,10-methenyltetrahydrofolate synthetase. Related to fatty acid synthase there are coenzyme-A transferases and acetyl-CoA hydrolase (makes acetate). Citrate lyase is in the TCA cycle. IF-2B is a translation initiation factor. Finally there is glucosamine-6P isomerase and very interestingly ribose-5P isomerase. That interconverts ribose and ribulose in the pentose phosphate pathway.

The MurCD/PglD bin contains things that work with UDP-sugars and derivatives. MurC and MurD attach an alanine and a glutamate to UDP-N-acetylmuramate in making bacterial cell walls.

The activating enzymes of ubiquitin-like proteins, or “E1 enzymes” are mostly studied in eukaryotes where ubiquitin targets proteins for degradation in the proteasome. But the general machinery exists in prokaryotes too and the activation process involves an adenylyl modification of the ubiquitin-like protein (UBL), where the adenylyl is replaced by the E1 (E1-UBL link) prior to attachment of the UBL to another protein, “activation” of the UBL. ThiF (thiamine biosynthesis) is in here, and it activates ThiS (on the big drawing) in a similar manner to exchange the adenylate with sulfur. That is something I want to add to the big drawing like I did with molybdenum cofactor biosynthesis, and MoeB (molybdenum cofactor biosynthesis) is in this bin too. No NAD(P)/FAD/SAM in here.

Ubiquitin-like Protein Conjugation: Structures, Chemistry, and Mechanism Cappadocia 2017

The formate/glycerate dehydrogenase catalytic domain-like topology bin is interesting because every catalytic region could be relevant, and especially for smaller, more origins relevant molecules. There is S-adenosyl-L-homocysteine hydrolase (methionine biosynthesis), and alanine dehydrogenase (alanine to pyruvate). Lysine oxidoreductase and dipicolinate synthase, both in lysine metabolism. Glycerate, formate and lactate dehydrogenases. Also something called nicotinamide nucleotide transhydrogenase. All of these require NAD(P) so this may still just be cofactor binding regions.

The UDPG/MGDP dehydrogenase C-term topology bin doesn’t have much but it’s all nucleotide-sugar proteins. UDP-glucose is central to carbohydrates and polymers of them for storage and membrane parts. And at least one requires NAD.

Finally there are the aspartate/ornithine carbamoyl transferases. PyrB is in the pyrimidine pathway.

Stage 2.2: moving bicarbonate over 2 atoms.

PurE1 doesn’t use phosphate. Mutase reactions change the structure of a molecule without adding or removing things, so it’s a low energy process and doesn’t need ATP. The proteins and their various amino acid side chains can hold the molecule just right, and alter charge distributions such that the bicarbonate moves over. And this is a protein domain family that I’ve already covered, the A+B3LS/Flavodoxin-like/GATase-like domain is the entire protein.

And bicarbonate is finally something in common with the pyrimidine pathway. Bicarbonate would have been in primordial oceans and would have seeped into the rock with water as well as contact with the outside of hydrothermal vent minerals.

General Biochemical Patterns in Purine Biosynthesis and Protein Relatives. Part 2.

In this post I will be looking at the rest of the proteins in stage 1 of purine biosynthesis, the related PurL and PurM. To this point an ammonia dispenser in PurF provided an ammonia that replaced the diphosphate on ribose in PRPP, an ATPgrasp ATP phosphate dispenser used a phosphate to add a glycine, and then a formate was added to the nitrogen on glycine with PurT/PurU, or PurN. This leaves us with FGAR, formylglycinamideribonucleotide.

Now I add steps 4 and 5. Swapping the glycine carbonyl with an imine, and closing the first purine ring.

Step 4: PurL and swapping the glycine carbonyl with an imine via phosphate.

PurL is a large protein, the largest on the drawing. To me large and usually complex means old. This protein uses a phosphate from ATP to swap the glycine carbonyl oxygen (C=O) for an imine (C=N). The ammonia again comes from glutamine. Even though it is probably old, parts of it can be deleted for fun if I assume ammonia is in the vent.

PurL has 5 individual domains, 2 present in duplicate. I consider some parts more important than others. In the vent the entire glutamine ammonia dispenser might be unnecessary.

  • An Alpha Arrays/RuvA-C-like domain (needs to be corrected from Beta Barrels on the drawing), the PurL linker domain.
  • An A+B2L/Alpha-Beta Plaits domain, the PurS-like domain.
  • An A+B3L/Bacillus chorismate mutase-like/PurM-N-like domain (present in duplicate).
  • An A+B2L/Alpha-Beta Plaits/PurM-C-like domain (present in duplicate).
  • An A/B3LS/Flavodoxin-like/Glutamate amidotransferase class 1 domain (GATase1).

RuvA-C-like domain

Starting with the Alpha Arrays/RuvA-C-like domain. This is a linker domain that acts in long range communication between parts of the protein. It’s all by itself in its homology group so that’s that.

PurS-like domain

This second domain in PurL is an Alpha + Beta 2 Layers domain called “PurS-like” because it is sometimes a separate protein, but always required for PurL function, so in that case they call it “PurS”. Its function isn’t well known but it isn’t the part that makes ammonia or holds FGAR, maybe it just holds the other 2 domains together. This is also in its own homology group all alone.

PurM-N-like domain

This domain is in the “bacillus chorismate mutase-like” X group, which may be related to the N-terminal part of PurM/parts of PurL. And maybe not. There is an interesting set of things in here and I consider this part and the following PurM-C-like domain to be the most important.

First is HypE which is involved in maturation of hydrogen utilization proteins with complex iron-nickel centers, hydrogenases. They can turn H2 into 2 protons and 2 electrons and back again. When it comes to the concept of energy, cells need to have and control a pool of protons and electrons, using things like niacin and riboflavin.

HypE binds a carboxamido (N-C=O) on its C-terminal cysteine, which came from carbamoyl phosphate (as in pyrimidine biosynthesis) and was delivered by HypF. ATP is then used to dehydrate it to a cyanide (nitrogen triple bonded to carbon). This cyanide and a second one are then bound to an iron atom (on HypC-HypD) used to make the finished iron center that is combined with a nickel atom.

Structural Insight into [NiFe] Hydrogenase Maturation by Transient Complexes between Hyp Proteins Miki 2020

Selenide, water dikinase, also known as selenophosphate synthetase, SelD. Synthases and synthetases both ligate (connect) things together but synthetases need a source of energy like ATP.

This combines selenium and phosphate to make selenophosphate, which is the form used by cells. Selenophosphate is then mostly used to make the 21st amino acid, selenocysteine (Sec, U), in a system that replaces the serine on a dedicated tRNA with selenocysteine, or for an RNA modification. The serine translation system is used to load that tRNA, and individual Sec insertions are made during translation. This is in bacteria, archaea and eukarya.

The selenophosphate synthetase family: A review Manta 2022

Finally thiamine monophosphate kinase (ThiL, on the big drawing) makes mature thiamine diphosphate.

I type mature because as far as I have found thiamine and thiamine monophosphate do nothing. It is thiamine diphosphate that is mostly used by cells, with some thiamine triphosphate and adenosine thiamine diphosphate. Those last 2 are somehow associated with glucose availability and amino acid starvation, and the opposite, amino acids and no glucose, respectively.

Update on Thiamine Triphosphorylated Derivatives and Metabolizing Enzymatic Complexes Bettendorff 2021

PurM-C-like domain.

And just like that we’re done with this domain because again we have PurL, PurM, HypE, selenophosphate synthetase, and ThiL. The only difference is this part of the protein is an Alpha+Beta 2 Layers and then in the Alpha-Beta Plaits X group, within which there is no guarantee of relationships outside of the general alpha helix and beta sheet organization.

Class 1 glutamine amidotransferase domain.

This is the glutamine ammonia dispenser for this protein. GATase class 2 we saw with PurF, a GATase2. The larger family this is a part of, the X:Flavodoxin-like and H: GATase1-like A+B3LS domains, appears many times on the big drawing. There is a lot to point out here. 20 related topology bins are siblings with the GATase1-like bin. And that topology bin has a lot besides the GATase family.

First the GATase1-like topology group itself has a lot of things already on the drawing. The current PurL, GuaA, CarA, PyrG, AroE, TrpG, PabA, FolD, HisH, MetA and Pdx2. There are more but they are in different topology bins. This topology group also has protease 1, peptidase S51, peptidase C26, peptidase S66, catalase 1 (breaks down peroxide), isocyanide dehydratase, and beta-galactosidase.

The same topology bin as AroE and FolD (amino acid dehydrogenase-like-N) also contains Glu/Phe/Trp/Leu/Val (amino acids) dehydrogenases, malate oxidoreductase (makes pyruvate from a TCA cycle intermediate), and at least 2 pterin (folate relative) proteins, methylene tetrahydromethanopterin dehydrogenase and tetrahydrofolate dehydrogenase/cyclohydrolase.

Pterin related, in its own topology bin is F420 dependent methylene tetrahydromethanopterin dehydrogenase.

In separate topology bins and also on the big drawing we have PurE, RibH, AroB (in a bin with glycerol dehydrogenase), and AroQ.

Things that grab things and move them is a theme for the Periplasmic Binding Proteins-like 1 and Chelatase-like bins. In bacteria with double membranes the space between them is called the periplasm. Periplasmic binding proteins bind things like amino acids (leucine here) and sugars (glucose, galactose, arabinose here). Chelatases bind and insert tetrapyrroles like heme and chlorophyll, which in turn often bind, “chelate” metal ions. This bin also contains iron-molybdenum binding proteins, zinc transporters, cobalt chelators, vitamin B12 transporters, and iron chelators.

Relatedly in its own topology bin is uroporphyrinogen 3 synthase, or HemD. This protein turns a linear chain of 5-membered rings into a circular ring heme/chlorophyll precursor, uroporphyrinogen 3. These often feature a chelated metal ion in mature form.

A last grabber/holder is the ATC-like topology bin, where the “ATC domain” contains a bunch of cysteines in the sequence which likely hold iron-sulfur centers, and is related to aspartate and glutamate racemases.

In a topology bin by themselves is phosphofructokinase and NAD (niacin) kinase. A glycolysis protein and the protein that makes NADP from NAD, and thus switches niacin to the anabolic form from the catabolic form.

The CheY-like bin is named after a chemotaxis protein, and contains pili related proteins, HydB (a NiFe hydrogenase), sensor kinases (RcsC), a vitamin B12 binding domain which is in MetH (big drawing), and ornithine decarboxylase.

In a topology bin by themselves there is fucose and arabinose isomerase. Another separate topology bin contains YfiR which makes cyclic di-GMP. Yet another topology bin contains IlvD (isoleucine, valine, and coenzyme-A biosynthesis) and 6-phosphogluconate dehydratase.

And finally the interesting FabD/lysophospholipase-like topology bin contains acyl carrier (FabD) and transfer proteins. It also has parts of fatty acid synthase. Binding and carrying coenzyme-A is a big theme. Lysophospholipases separate fatty acids from the glycerol backbone.

Step 5: PurM

PurM closes the ring via a phosphate from ATP, and the same basic reaction mechanism as PurL.

Thanks to PurL, I’m done with PurM since it’s the same 2 fragments shared with HypE, selenophosphate synthetase, and ThiL.

That’s it for stage 1 of purine biosynthesis, forming the first ring.
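The five steps of stage 1 can be written down as a chain where each product is the next substrate. This is only a minimal Python sketch for bookkeeping: PRPP, PRA, GAR, FGAR and AIR are the abbreviations used in these posts, while "FGAM" for the imine product of PurL is a label I am assuming here since the text does not name it, and the helper names `is_connected` and `made_by` are made up for illustration.

```python
# Stage 1 of purine biosynthesis as (step, protein(s), substrate, product).
STAGE_1 = [
    (1, "PurF", "PRPP", "PRA"),
    (2, "PurD", "PRA", "GAR"),
    (3, "PurN or PurT/PurU", "GAR", "FGAR"),
    (4, "PurL", "FGAR", "FGAM"),  # "FGAM" is an assumed label, not from the text
    (5, "PurM", "FGAM", "AIR"),
]


def is_connected(steps):
    """True if every product is the substrate of the following step."""
    return all(prev[3] == nxt[2] for prev, nxt in zip(steps, steps[1:]))


def made_by(product, steps=STAGE_1):
    """Return the protein(s) whose step yields the given product, or None."""
    for _, protein, _, prod in steps:
        if prod == product:
            return protein
    return None


print(is_connected(STAGE_1))  # True
print(made_by("AIR"))         # PurM
```

Writing it this way makes it easy to see where other pathways could branch off an intermediate, such as AIR feeding thiamine.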

General Biochemical Patterns in Purine Biosynthesis and Protein Relatives. Part 1.

In this post I want to start expanding on the kind of analysis I did in a previous post with the protein that adds glycine in purine biosynthesis, PurD. I looked at what proteins are related to each of the subdomains and noted the kinds of chemistry and molecules involved, and did some fun speculation. A family of phosphate dispensers and very small molecules with HO-C=O as part of their structure.

And these are still puzzle pieces so it’ll be like a list in places where there might be significance, but I don’t know much about them. First some words about the kind of data on the site I primarily use to look at the relatives of parts of proteins.

Evolutionary Classification Of protein Domains and bin importance: AXHTF.

ECOD and similar archives of data relating to the relatedness of nucleotides or proteins have hierarchies of bins to group by relatedness. The largest bin at ECOD is the Architecture or “A” bin which groups by the most general arrangements of protein 3D structure, the Alpha Helix  or the Beta Sheet, and how those relate.

Top Left: water is tetrahedral, polarized (protons+, electrons-), and constantly jostling around. Top Center-Left: a tripeptide has free rotations around its bonds and isn’t just abstract and straight, things rotate around in the water. Top-Right: a dipeptide showing more parts of the molecule than I typically use to indicate where positively and negatively polarized regions are (“charged” is for ions). Center: helices (like the Alpha Helix) are one common shape charges allow polypeptides to take. Bottom: sheets (like the Beta Sheet) are another common shape charges allow polypeptides to take. Most structures at this level are kinds of sheets and helices.

So you get things like: Alpha/Beta 3 layer sandwiches defined as “repeating beta-alpha units form a sandwich with a mainly parallel beta-sheet layer stacked between two alpha-helix layers” or Alpha+Beta 2 layers. These outermost levels can be all related but they can just as easily be independently arrived at shapes. The same is true of the “X” group which is “possible homologs” where homologs are related by sequence. The same problem exists in A and X but things might be related in X.

Below the X group we get the “H” or homology group where things are explicitly related based on sequence. Within H there is “T” or topology which means they have the same sequence but there are differences in structure. Finally “F” or family are protein subdomains most closely related to one another.

HTF are where I look the most.
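To make the bin hierarchy concrete, here is a minimal Python sketch that treats each domain as a path through the A, X, H, T and F levels and reports the deepest level two domains share. The `Domain` class, the `deepest_shared_level` helper, and the bin labels in the example entries are placeholders of mine for illustration, not records copied from ECOD.

```python
from dataclasses import dataclass

LEVELS = ("A", "X", "H", "T", "F")  # outermost to innermost bins


@dataclass(frozen=True)
class Domain:
    name: str
    bins: tuple  # one label per level, in LEVELS order


def deepest_shared_level(a, b):
    """Return the innermost level at which two domains share a bin, or None."""
    shared = None
    for level, bin_a, bin_b in zip(LEVELS, a.bins, b.bins):
        if bin_a != bin_b:
            break
        shared = level
    return shared


# Illustrative labels only (hypothetical, not ECOD identifiers).
dom1 = Domain("example 1", ("arch-1", "x-1", "h-1", "t-1", "f-1"))
dom2 = Domain("example 2", ("arch-1", "x-1", "h-1", "t-2", "f-9"))
dom3 = Domain("example 3", ("arch-1", "x-2", "h-7", "t-5", "f-3"))

print(deepest_shared_level(dom1, dom2))  # H
print(deepest_shared_level(dom1, dom3))  # A
```

Two domains that only share A or X may or may not be related, while sharing H, T or F is where relatedness is actually being claimed.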

Purine biosynthesis, Adenosine and Guanosine.

I see 5 stages and 13 steps in purine biosynthesis.

  1. Formation of the first ring (5 steps).
  2. Carbon addition and movement (2 steps).
  3. Aspartate addition and fumarate removal (2 steps).
  4. Formation of the second ring (2 steps).
  5. Adenosine or Guanosine monophosphate formation (2 steps, separate paths).
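As a quick arithmetic check on the list above, this small Python sketch adds up the steps per stage and works out which step number each stage starts on. The variable names are mine, and the last stage is counted as 2 steps even though the two steps run on separate paths for adenosine and guanosine.

```python
# Stage names and step counts as listed above.
STAGES = [
    ("Formation of the first ring", 5),
    ("Carbon addition and movement", 2),
    ("Aspartate addition and fumarate removal", 2),
    ("Formation of the second ring", 2),
    ("Adenosine or Guanosine monophosphate formation", 2),
]

total_steps = sum(steps for _, steps in STAGES)
print(len(STAGES), total_steps)  # 5 13

# Step number at which each stage begins (stage number -> first step).
first_step = {}
next_step = 1
for stage_number, (_, steps) in enumerate(STAGES, start=1):
    first_step[stage_number] = next_step
    next_step += steps

print(first_step)  # {1: 1, 2: 6, 3: 8, 4: 10, 5: 12}
```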

I will be limiting myself in this post to most of Stage 1: Ribose phosphate, glycine, formate and ammonia join up. This post got longer than I thought it would for all of stage 1 but building up ring closing as a specific thing could be useful.

The first 3 steps of purine biosynthesis.

Step 1: PurF

This protein exchanges an ammonia for a diphosphate from Phospho-ribose-diphosphate (PRPP), making phospho-ribosyl-amine. This pattern of using phosphate to swap for other things occurs over and over. PRPP was made from ribose-5-phosphate with ATP by the gene Prs, which is related to the second domain in this protein. Cells have a PRPP pool because as the drawing shows, PRPP has multiple uses.

Phosphoribosyl Diphosphate (PRPP): Biosynthesis, Enzymology, Utilization, and Metabolic Significance. Hove-Jensen 2016

Also known as amidophosphoribosyltransferase (you see why I prefer short protein names? I usually like descriptive names), PurF has 2 subdomains:

  • An Alpha+Beta 4 layer (A+B4L)/NTN hydrolase domain at the end with the nitrogen (N-terminal, defined as the beginning of the sequence)
  • An Alpha/Beta 3 Layer Sandwich (A/B3LS)/PRTase-like domain at the carboxy or C-terminal end.

The “amido” refers to the fact that the ammonia comes from the amino acid glutamine, which becomes the amino acid glutamate after removing the nitrogen in the first step of PurF activity carried out by the NTN-hydrolase domain, making phospho-ribosyl-amine (PRA).

The A+B4L subdomain is in the N-terminal nucleophile aminohydrolase (NTN hydrolase) homology group, and class 2 glutamine amidotransferase topology and family groups (GATase2). In terms of primordial chemistry, the chemistry of an end of the protein chain is a significant thing. This N-terminal nitrogen protein superfamily has an affinity for nuclei (nucleophile) and breaking bonds with water (hydrolase). Without the need for ATP this protein domain removes the ammonia and that reactive molecule floats over to PRPP which is reactive towards ammonia.

The A/B3LS domain at the C-terminal end is in a transferase subgroup of these proteins, PRTases, where Prs is a synthase. This subdomain essentially just holds the ribose-5P end. In fact at this point PRTases are just ribose phosphate holders to me. This particular family has one other synthase for OMP, and a bunch for purine and pyrimidine bases in general. The interesting part for PRTases is that the ones for niacin, histidine, and tryptophan are in different protein subdomain families on ECOD.

The rest of the entry for PurF will be on the A+B4L subdomain and what’s interesting there. In the sibling homology group to the NTN aminohydrolases there are protein serine/threonine phosphatase 2C members, specifically the part of the protein that does the removal, the catalytic domain. Phosphatases are proteins that remove phosphate from things, here serine or threonine. With all of the phosphate around a related family of phosphate removers is interesting. Looking at them more closely is a plan.

Inside the NTN hydrolase homology group there are 6 related topology groups, one of which is the GATase 2 group. Outside of GATases there are some interesting things. “Proteasome subunits” contains parts of the cellular trash can, the proteasome, and individual peptidases which are things that cut peptide bonds in proteins. This group also includes PurO which is in stage 4 of purine biosynthesis. There are some antibiotic biosynthesis enzymes for penicillin and carbapenam, and things that break C-N bonds other than peptide bonds. I’ll probably ignore antibiotics going forward, there are lots of them and they are unlikely to be usefully origins related.

Glycosyl-asparaginase removes aspartate from glucose-asparagine leaving glucose-amine (glycoprotein degradation). Gamma-glutamyl-transpeptidases move glutamate to things and from things.

Inside of the GATase2 group is asparagine synthase, which makes that amino acid from aspartate (via glutamate), and the protein that makes glutamate from 2-oxoglutarate. There is also a protein that makes glucosamine from glutamate and glucose-6P.

That’s it for PurF for now.

Step 2: PurD

I first introduced this protein with the big drawing.

This protein has 2 steps in its reaction. First the phosphate dispenser part releases a phosphate from ATP which binds the carboxy on glycine making glycine-P. This is reacted with PRA and the phosphate is swapped for the nitrogen on the ribose like in PurF, making glycinamide-ribonucleotide (GAR). As I mentioned previously this is the closest molecule to both nucleotides and polypeptides as it has peptide bond features.

This protein contains 3 subdomains in order:

  • An A/B3LS/Rossmann-related preATPgrasp domain (thought to be substrate specificity)
  • An Alpha/Beta Complex Topology (A/BC.T.) ATPgrasp domain (the phosphate dispenser domain)
  • An A/BC.T./Alpha Beta hammerhead barrel hybrid sandwich domain related to the carbon monoxide dehydrogenase molybdenum protein (CODH MO-N-LIKE) and biotin carboxylases

There are patterns to the phosphate dispenser with and without the substrate specificity part, and patterns to CODH MO-N with the rest and by itself.

First of all there are 3 more ATPgrasp proteins in purine biosynthesis, and the ATPgrasp domain by itself is related to purine biosynthesis protein PurC (SAICAR synthase). Both of those are sibling homology groups with a bunch of protein kinases, things that put phosphate on serine, threonine, or tyrosine in a polypeptide chain, usually to change the function of a protein somehow.

When all 3 subdomains are together you have PurD, PurT, PurK (Pur=purine biosynthesis), L-amino acid ligase (attaches an L-amino acid to things), and a set of things including biotin, acetyl-CoA, and pyruvate carboxylases.

PurP only has the preATPgrasp and ATPgrasp domains. Also pyrimidine pathway CarB.

When the phosphate dispenser is alone, no preATPgrasp, you have a collection of interesting things. RNA and DNA Ligase (protein that attaches things). Ligases for glutamate (GshA, from the ancestral fragment post) or tyrosine and tubulin. Citrate lyase, succinyl-CoA synthase, both related to the TCA cycle. PEP (phosphoenolpyruvate) synthetase, a protein involved in the first step of gluconeogenesis (making glucose instead of breaking it down for energy) in some prokaryotes.

When the CODH MO-N-LIKE bit is by itself you have purine breakdown protein xanthine dehydrogenase chain B, which adds an oxygen. There’s the CODH MO-N itself. There is aldehyde oxidoreductase.
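The domain combinations described above can be tabulated so the patterns are easier to query. This is a rough Python sketch covering only the proteins the text places explicitly (the three-domain Pur proteins, plus PurP and CarB with two domains). The function names `group_by_architecture` and `with_domain` are hypothetical helpers, not anything from ECOD.

```python
from collections import defaultdict

PRE, GRASP, CODH = "preATPgrasp", "ATPgrasp", "CODH MO-N-like"

# Domain order per protein, limited to the proteins named in the text.
ARCHITECTURES = {
    "PurD": (PRE, GRASP, CODH),
    "PurT": (PRE, GRASP, CODH),
    "PurK": (PRE, GRASP, CODH),
    "PurP": (PRE, GRASP),
    "CarB": (PRE, GRASP),
}


def group_by_architecture(table):
    """Map each domain combination to the proteins that have it."""
    groups = defaultdict(list)
    for protein, domains in table.items():
        groups[domains].append(protein)
    return dict(groups)


def with_domain(table, domain):
    """Alphabetical list of proteins containing the given domain."""
    return sorted(p for p, domains in table.items() if domain in domains)


groups = group_by_architecture(ARCHITECTURES)
print(groups[(PRE, GRASP)])              # ['PurP', 'CarB']
print(with_domain(ARCHITECTURES, CODH))  # ['PurD', 'PurK', 'PurT']
```

More rows (the ligases, carboxylases and lone ATPgrasp proteins) could be added the same way once their exact domain sets are checked.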

This CODH MO-N-LIKE fragment gets more interesting when you zoom out one level. This is the longest part of this post.

First we have MoaE, molybdopterin synthase. That’s on the drawing! And there’s another on the drawing with the PRTase that attaches ribose-5P to the ring in making niacin. There are a couple of proteins from the large ribosomal subunit too, L10 and L27. The L27 topology bin also has parts of the “exosome complex” which degrades RNA, another trash can.

The “duplicated hybrid motif” bin contains 3 interesting things. The first is part of a system that uses the phosphate from PEP and passes it along several other proteins to a membrane protein (the PTS E2A fragment here) that phosphorylates a cargo, usually a sugar, during import. The second is a family of peptidases including one that cuts between glycines (LytM). The third is the RNA polymerase beta prime subunit.

Finally there is the “single hybrid motif” which has a lot of things. First there is the RNA polymerase beta subunit (as opposed to beta prime). Second the protein domain that binds both cofactors lipoate and biotin are related to each other and are in here. Those are newer cofactors so I haven’t done much with them. Third is the glycine cleavage system H protein, this is very significant as this is where cells get glycine and serine through interconversion. Fourth AstE_AspA are Succinylglutamate desuccinylase/aspartoacylase. The first is an arginine degradation protein, and the second is too but seems limited to eukaryotes (or isn’t studied in prokaryotes due to the human disease emphasis).

New paragraph due to theme difference. The fifth thing in the “single hybrid motif” is the V-type ATP synthase alpha chain. The sixth is RnfC, a protein involved in anaerobic electron transport with niacin and pumping protons or sodium ions. Relatedly the seventh is cytochrome F, something that passes electrons in electron transport chains. NqrA is similarly involved in niacin and electron/sodium transport.

I’ll finish up this with the eighth and ninth, NusG and Rrp15p. They are both involved with ribosomes. NusG is a major transcription (copying DNA) termination factor, and Rrp15p is required for cells to make the large ribosomal subunit.

Step 3: PurT and PurU or PurN

In this step a formate is removed from folate and attached to the nitrogen at the end of the glycine that was just attached, to make formyl-glycin-amide-ribonucleotide, FGAR.

This is something I will have to fix on the big drawing but life has more than one protein for the third step of purine biosynthesis. I have PurT and PurU, but not PurN. We’ve technically seen the relations of step 3 PurT since it is also an ATPgrasp protein (all 3 domains).

PurU is the protein that removes the formate and passes it to PurT, where the phosphate from ATP is used to make a reactive formyl-P that, similarly to the previous steps, is used to attach a carbon to a nitrogen.

This is where things get easier for PurN because PurU is related to PurN so I can cover them together, though PurU has more pieces, suggesting the most important part is the one they share.

PurU has 2 domains:

  • An Alpha/Beta 3 layer sandwich (A/B3LS)/formyltransferases domain.
  • An Alpha+Beta 2 layer (A+B2L)/Alpha-Beta Plaits domain.

The first domain is related to PurN in its entirety. The homology and topology bins are formyltransferases. The most interesting ones are methionyl-tRNA Fmet formyl transferase (adds formate to the initiator methionine in translation) and 10-formyltetrahydrofolate dehydrogenase (removes formate from folate as CO2 while adding a proton to niacin).

The second domain in PurU is related to a lot of things but I’m thinking I can ignore those given PurN.

That’s it for now. It would have easily been over twice as long if I included PurL and PurM. The Alpha+Beta 2 Layers category has an “Alpha-Beta Plaits” with lots of things. It’ll be practice at sorting out general biochemical themes.

LGBTQ+ People Are Not Going Back.

I’m still generally fine with the sentiments in Julia Serano’s post.

1) I will not tolerate any backpedaling on LGBTQ+ rights whatsoever, and

2) If my representatives fail to strongly stand up against these attacks on LGBTQ+ rights, then I will take my vote elsewhere next election.

Words aren’t enough. Actions. Please contact your representatives to tell them the same.

Planned politics is good.

I’m passing this along.

“I propose that on Tuesday, December 3rd, 2024 (the first day that both the House and Senate are back in session), all of us who are invested in this issue and have a platform (whether it be a blog, newsletter, column, podcast, YouTube, TikTok, Instagram, etc.) publish a piece with the shared title: “LGBTQ+ People Are Not Going Back.” Yes, I know, it’s a cheesy title, but it holds Democrats accountable to their own talking points and makes it clear that backsliding on LGBTQ+ rights is nonnegotiable for us.”

Planned Action for LGBTQ+ & Allies in Response to Democrats Capitulating on Trans Rights by Julia Serano

Abortion

I try to keep my abortion position as independent of law as possible and the most important part of this post concerns this. That being said I’ll include legal and religious angles at the end that I think are useful. But these should not be used to debate, too often forced-birthers steal the political atmosphere with debate. Delight in giving orders, and stating things as the way things are. Learn to enjoy their opportunity to figure out their negative feelings about your actions.

What freedom looks like.
My position can be summarized as there is no right to force someone to give birth or stop someone from offering the services. Without forced-birthers people would just get abortions. That’s the freedom and liberty part.
As a group forced-birthers get by because they are a group, a mob, people who can force others to give birth themselves or through government. Relatedly any forced-birther is a stand-in for the people they put into power when it comes to shaming and criticism.

That’s it. They assert the power to use force on others with no good reason or rights to do so. If one were to want a legal position I’ve been confused as to why the 9th amendment has not been used as a reason, “The enumeration in the Constitution, of certain rights, shall not be construed to deny or disparage others retained by the people.” Including the right to go get or offer an abortion which harms the rights of no one else. They have to go to “spectral evidence” and speak for fetuses eventually, and I don’t give them that power to make up words on behalf of the unborn without minds or words.

I don’t even believe the Christians among forced-birthers have their own theology right. When I was raised in that environment a big value was placed on sin and choice, they have to use force. They don’t even believe what they say. And it’s a fine for killing the unborn of another in their book. They have manipulation of language only.

Finally I’ve a way of dealing with their “baby talk”. It’s untested but rhetorically in the moment give them “baby” or “child” mockingly, but make the conceptus “baby 0.01”, the embryo “baby 0.5”. Be clear that you aren’t serious and that they simply emotionally require and depend on recalling feelings AFTER birth and can’t deal with reality. You can bet they’d destroy embryology to get their way if they had to.

The chirality issue

Note: the second figure has been replaced with the correct one, twice. A xylulose bond was in the wrong direction. This is hard.

There are a couple of other origin of life issues that I haven’t focused on yet. One that I was hoping would reveal itself is the issue of “chirality”, or the side of a molecule a bond or atom is on relative to the rest of the molecule in 3D. The strict definition is that a chiral molecule cannot be superimposed on its own mirror image, so two structures with identical chemical formulas and connections can still be different molecules.
To simplify things, I don’t tend to draw in 3D, and I also leave out the hydrogens.

I meant to address this in a previous post where I posted this figure of thiamine and parts of the pentose phosphate pathway. It’s why there is a single hydroxyl on xylulose-5P drawn in 3D as if it’s coming at you out of the page. I’ll get to that at the end.

I’ve drawn ribose and xylulose in 3D, both looking down the length of the molecule and from the side. Carbon actually binds 4 things and is tetrahedral, like a pyramid. So I put the hydrogens back into the upper structures and used conventional dashed and filled wedges to represent bonds going into and out of the plane of the page, respectively.

Each of those hydroxyls could be where the hydrogen is on the carbon instead. If you flipped the hydrogen and hydroxyl on one carbon of ribose, you can see how the formula would be the same but the structures would not be superimposable. Things on different sides are designated with R/S or D/L distinctions that refer to single atoms on the structure. D-ribose refers to the stereochemistry around the chiral carbon farthest from the aldehyde group in the form of ribose used by life.
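The hydrogen and hydroxyl swap can be sketched in a few lines of Python. This is only a toy, under some loud assumptions: one carbon with four different substituents sits on idealized tetrahedral corners, the labels (`chain_up`, `chain_down`) and the function names are made up for illustration, and the alphabetical ordering is just a fixed bookkeeping order, not the real Cahn-Ingold-Prelog rules behind R/S. The sign of the triple product stands in for handedness.

```python
# Idealized tetrahedral directions for the four bonds on one carbon.
TETRAHEDRON = [
    (1.0, 1.0, 1.0),
    (1.0, -1.0, -1.0),
    (-1.0, 1.0, -1.0),
    (-1.0, -1.0, 1.0),
]


def handedness(center):
    """Return +1 or -1 for a carbon given as {label: (x, y, z)}.

    Labels are sorted only to get a fixed order; this is NOT the
    Cahn-Ingold-Prelog priority used for real R/S assignments.
    """
    a, b, c, d = (center[label] for label in sorted(center))
    # Vectors from the fourth substituent to the other three.
    u = tuple(a[i] - d[i] for i in range(3))
    v = tuple(b[i] - d[i] for i in range(3))
    w = tuple(c[i] - d[i] for i in range(3))
    # Scalar triple product u . (v x w); its sign flips under mirroring.
    triple = (
        u[0] * (v[1] * w[2] - v[2] * w[1])
        - u[1] * (v[0] * w[2] - v[2] * w[0])
        + u[2] * (v[0] * w[1] - v[1] * w[0])
    )
    return 1 if triple > 0 else -1


def swap(center, first, second):
    """Exchange the positions of two substituents (e.g. H and OH)."""
    flipped = dict(center)
    flipped[first], flipped[second] = center[second], center[first]
    return flipped


def mirror(center):
    """Reflect every substituent through the x = 0 plane."""
    return {label: (-x, y, z) for label, (x, y, z) in center.items()}


# Hypothetical labels: H, OH, and the two directions along the sugar chain.
labels = ["H", "OH", "chain_down", "chain_up"]
carbon = dict(zip(labels, TETRAHEDRON))
flipped = swap(carbon, "H", "OH")

# The swapped carbon and the mirrored carbon share one handedness,
# and it is the opposite of the original.
print(handedness(carbon), handedness(flipped), handedness(mirror(carbon)))
```

Swapping two groups on one carbon gives the same handedness as holding that carbon up to a mirror, which is why the formula stays the same while the structure can no longer be laid on top of the original.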

The amino acids used in proteins are all L- at the alpha carbon (the one carrying the side chain), aside from glycine, which has no atoms with stereochemistry.

So why is life one way and not the other when it comes to these bond directions? I don’t know, but I noticed that the only sugar with a bond in the other configuration is xylulose-5P, so I drew that one bond in 3D on the thiamine figure. The rest are in the other configuration. And xylulose is the donor for the pentose phosphate pathway 2C molecules. This isn’t a solution to the chirality problem, but it is a clue.