C3orf52
| C3orf52 | |
|---|---|
| Identifiers | |
| Aliases | C3orf52, TTMP, chromosome 3 open reading frame 52 |
| External IDs | OMIM: 611956; MGI: 2384848; HomoloGene: 11622; GeneCards: C3orf52; OMA: C3orf52 - orthologs |
Chromosome 3 open reading frame 52 (C3orf52), also known as TTMP or TPA-induced transmembrane protein (accession: NP_078892),[5] is an uncharacterized protein encoded in humans by the C3orf52 gene. The gene (accession: NM_024616)[6] is located on the plus strand of chromosome 3 at locus 3q13.2 and spans approximately 31,822 base pairs. C3orf52 encodes a transmembrane protein that is predicted to act as a membrane-associated regulatory factor in epithelial tissues. Structural features of the protein, including a conserved transmembrane domain, a SEA domain, and extensive glycosylation, suggest a role in protein-protein interactions. Evidence indicates that C3orf52 acts as a cofactor for LIPH, facilitating the localized lysophosphatidic acid production required for hair follicle morphogenesis. Loss of C3orf52 disrupts this lipid signaling pathway, resulting in autosomal recessive hypotrichosis.
Expression patterns
Immunohistochemical micrographs from the Human Protein Atlas demonstrate C3orf52 expression in colon and stomach tissues. Localization is concentrated along the luminal borders of the epithelial cells in the colon, and is highly abundant in glandular cells. These findings suggest that C3orf52 exhibits membrane-associated expression.[9]
Analysis of human tissue expression indicates that C3orf52 displays approximately fourfold variation across tissues. Its expression is tissue-restricted and highly regulated. The highest expression is observed in the thyroid and salivary glands (Figure 2), while other sources report moderately high expression in the skin, pancreas, and stomach.[10] Expression in other analyzed tissues, such as the heart and brain, is low to undetectable.
Molecular features
mRNA
C3orf52 has two transcript variants. Transcript variant 1 is a shorter transcript (753 nucleotides) but encodes a longer isoform (TPA-induced transmembrane protein isoform 1). Transcript variant 2 (654 nucleotides) encodes a shorter isoform (TPA-induced transmembrane protein isoform 2) with a different C-terminus.[6] Transcript variant 2 consists of six exons.
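The nucleotide counts quoted above match what a coding sequence of each isoform would require: a protein of N residues needs 3 × (N + 1) nucleotides once the stop codon is counted. The minimal Python sketch below checks that arithmetic against the isoform lengths given in this article (250 and 217 amino acids). The function name `cds_length` is illustrative, and reading 753 and 654 as coding-sequence lengths rather than full transcript lengths is an assumption based only on this arithmetic.

```python
def cds_length(protein_length: int, include_stop: bool = True) -> int:
    """Return the number of nucleotides needed to encode a protein.

    Each amino acid is encoded by one codon of three nucleotides; the stop
    codon adds one further codon when include_stop is True.
    """
    if protein_length < 0:
        raise ValueError("protein_length must be non-negative")
    codons = protein_length + (1 if include_stop else 0)
    return 3 * codons


# Isoform lengths (amino acids) as reported in this article.
isoforms = {"isoform 1": 250, "isoform 2": 217}

for name, length in isoforms.items():
    print(f"{name}: {length} aa -> {cds_length(length)} nt")
```

Under this assumption, isoform 1 corresponds to 753 nucleotides and isoform 2 to 654 nucleotides, the same figures listed for the two transcript variants.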
Protein
C3orf52 encodes two isoforms. TPA-induced transmembrane protein isoform 1 is 250 amino acids long, while isoform 2 is 217 amino acids long.[6] Isoform 2 is the predominant isoform of C3orf52 in humans and was used as the basis for subsequent research on this protein. The C3orf52 protein includes a disordered region, a transmembrane region, and a major polyA site.[5] Its molecular weight is 24.3 kDa.[12] The protein is predicted to be localized primarily in the cytoplasm (94.1%), with specific localization to the endoplasmic reticulum (44.4%).[13]
Analysis of the human C3orf52 protein using the SAPS tool showed no significant positive, negative, or mixed charge clusters, and no known sequence patterns were identified.
Compositional analysis indicates reduced levels of alanine, histidine, and arginine residues and an increased number of glutamic acid residues compared to standard human protein levels, indicating that this protein is acidic. C3orf52 has a predicted isoelectric point of 3.99.[12] Additionally, results show a high-scoring transmembrane segment spanning amino acids 66 to 93, which is also among the protein's most hydrophobic segments. C3orf52 contains a repeated four-amino-acid motif, "LELS", at amino acids 13–16 and again at positions 101–104. This region is not conserved among orthologs.
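A repeated short motif such as "LELS" can be located by indexing every k-residue window of a sequence and keeping those that occur more than once. The sketch below illustrates the idea in Python. It is not the SAPS algorithm, the function name `repeated_kmers` is illustrative, and the input is a hypothetical toy peptide rather than the real C3orf52 sequence.

```python
from collections import defaultdict


def repeated_kmers(sequence: str, k: int = 4) -> dict[str, list[int]]:
    """Map each k-mer that occurs more than once to its 1-based start positions.

    Overlapping occurrences are counted separately.
    """
    positions: dict[str, list[int]] = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i + 1)  # convert to 1-based
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


# Hypothetical toy peptide (NOT the C3orf52 sequence) containing "LELS" twice.
toy_peptide = "MAGLELSKTPWQLELSRD"
print(repeated_kmers(toy_peptide))  # {'LELS': [4, 13]}
```

Applied to the real isoform 2 sequence with `k=4`, the same scan would be expected to report "LELS" at start positions 13 and 101, as described in the text.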
The C3orf52 protein has a predicted SEA (sea urchin sperm protein, enterokinase, agrin) domain spanning amino acid positions 128–172. Approximately half of the amino acids in this domain of the human protein are conserved among 70% of the orthologs listed in Table 1. All orthologs in the table nevertheless have this domain in the corresponding region of the protein, although the residues differ. SEA domains are common in eukaryotes and are found within extracellular proteins located in highly glycosylated environments.[14]
Post-translational modifications
Human C3orf52 is predicted to contain three phosphorylation sites at positions 140, 180, and 183, two N-glycosylation sites at positions 106 and 159, and three O-linked glycosylation sites at positions 7, 26, and 37.[15][16][17] All of the O-linked glycosylation sites are within the disordered region of this protein. This suggests that C3orf52 is a moderately regulated protein that likely functions more as a scaffold than as a structural protein.
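Candidate N-glycosylation sites are commonly screened with the consensus sequon N-X-S/T, where X is any residue except proline. The Python sketch below shows such a screen. It is only a first-pass motif filter and not the method of the prediction tools cited above, a matching sequon does not guarantee that the site is glycosylated, and the input is a hypothetical toy peptide rather than the C3orf52 sequence.

```python
import re

# Asparagine followed by any residue except proline, then serine or threonine.
# The lookahead keeps matches one residue wide so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glyc_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical toy peptide (NOT the C3orf52 sequence).
# N3 (N-G-S) matches, N8 (N-P-T) is excluded by the proline rule,
# N13 (N-N-T) and N14 (N-T-S) both match and overlap.
toy_peptide = "MKNGSAANPTLLNNTS"
print(n_glyc_sequons(toy_peptide))  # [3, 13, 14]
```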
Evolutionary history
Paralogs
An NCBI protein BLAST search showed no known paralogs of C3orf52 in humans.[18]
Orthologs
C3orf52 has identifiable orthologs exclusively within vertebrates (see Table 1 below), extending as far back as the cartilaginous fishes (Chondrichthyes). Outside of vertebrates, including all invertebrate animals, bacteria, archaea, fungi, plants, and protists, no identifiable sequence homology is detected for this protein. The most distant homolog of C3orf52 detected was in the elephant shark (Callorhinchus milii), whose lineage diverged from that of humans about 495.2 million years ago, suggesting that this is approximately when the C3orf52 gene first arose.
| Genus and species | Common name | Taxonomic group | Date of divergence (MYA) | Sequence length (AA) | AA identity (%) | AA similarity (%) | Accession number |
|---|---|---|---|---|---|---|---|
| Homo sapiens | Human | Primates | 0 | 217 | 100 | 100 | NP_078892.3 |
| Pan troglodytes | Chimpanzee | Primates | 6.4 | 217 | 97 | 98 | XP_001154447.2 |
| Trachypithecus francoisi | Francois' leaf monkey | Primates | 28.8 | 217 | 89 | 93 | XP_033067213.1 |
| Equus przewalskii | Przewalski's horse | Perissodactyla | 94 | 250 | 77 | 87 | XP_008528528.1 |
| Diceros bicornis minor | South-central black rhinoceros | Perissodactyla | 94 | 251 | 74 | 85 | XP_058411758.1 |
| Tachyglossus aculeatus | Australian echidna | Monotremata | 180 | 214 | 52 | 67 | XP_038622330.1 |
| Struthio camelus | Common ostrich | Struthioniformes | 319 | 218 | 37 | 54 | XP_068764855.1 |
| Empidonax traillii | Willow flycatcher | Passeriformes | 319 | 211 | 36 | 53 | XP_027757055.1 |
| Mauremys reevesii | Chinese pond turtle | Testudines | 319 | 233 | 43 | 58 | XP_039395838.1 |
| Carettochelys insculpta | Pig-nosed turtle | Testudines | 319 | 214 | 41 | 56 | XP_074839613.1 |
| Chelonoidis abingdonii | Pinta Island tortoise | Testudines | 319 | 218 | 40 | 54 | XP_074928790.1 |
| Ambystoma mexicanum | Axolotl | Caudata | 352 | 239 | 42 | 56 | XP_069492559.1 |
| Rhinatrema bivittatum | Rhinatrema | Gymnophiona | 352 | 217 | 41 | 60 | XP_029434280.1 |
| Pleurodeles waltl | Iberian ribbed newt | Caudata | 352 | 236 | 40 | 52 | XP_069060250.1 |
| Microcaecilia unicolor | Tiny Cayenne caecilian | Gymnophiona | 352 | 225 | 39 | 58 | XP_030060067.1 |
| Siphateles boraxobius | Borax Lake chub | Cypriniformes | 429 | 248 | 34 | 52 | XP_077063864.1 |
| Acipenser ruthenus | Sterlet | Acipenseriformes | 429 | 260 | 33 | 52 | XP_058884635.1 |
| Etheostoma spectabile | Orangethroat darter | Perciformes | 429 | 268 | 32 | 49 | XP_032360225.1 |
| Pseudorasbora parva | Stone moroko | Cypriniformes | 429 | 251 | 30 | 49 | XP_067307703.1 |
| Carcharodon carcharias | Great white shark | Lamniformes | 462 | 376 | 29 | 50 | XP_041067311.1 |
| Callorhinchus milii | Elephant shark | Chimaeriformes | 495 | 835 | 18 | 40 | XP_007894202.1 |
Protein divergence
C3orf52 is evolving more quickly than other common proteins, including cytochrome c and fibrinogen alpha chain. This suggests that selective pressure on the protein may be driving its rapid evolution.
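One common way to compare rates of protein evolution is to convert percent identity into a corrected divergence and relate it to the date of divergence. The Python sketch below does this for a subset of the orthologs in Table 1, using the Poisson correction m = −100 × ln(identity / 100). The choice of correction is an assumption, since the article does not state which method was used, and no values for cytochrome c or fibrinogen alpha chain are included because none are given here.

```python
import math

# (common name, date of divergence in MYA, % amino acid identity to human),
# copied from Table 1 of this article.
ORTHOLOGS = [
    ("Chimpanzee", 6.4, 97),
    ("Przewalski's horse", 94, 77),
    ("Australian echidna", 180, 52),
    ("Common ostrich", 319, 37),
    ("Axolotl", 352, 42),
    ("Sterlet", 429, 33),
    ("Great white shark", 462, 29),
    ("Elephant shark", 495, 18),
]


def corrected_divergence(identity_percent: float) -> float:
    """Poisson-corrected divergence, in substitutions per 100 sites.

    Assumes substitutions follow a Poisson process, so that the expected
    number of changes per site is -ln(fraction of identical sites).
    """
    if not 0 < identity_percent <= 100:
        raise ValueError("identity_percent must be in the interval (0, 100]")
    return -100.0 * math.log(identity_percent / 100.0)


for name, mya, identity in ORTHOLOGS:
    corrected = corrected_divergence(identity)
    print(f"{name:20s} {mya:6.1f} MYA  observed {100 - identity:3d}%  "
          f"corrected {corrected:6.1f}")
```

Plotting the corrected values against divergence date for C3orf52 and for reference proteins would allow the slopes, and therefore the relative rates of change, to be compared.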
Protein interactions
| Protein | ID | Description | Detection method | Subcellular location |
|---|---|---|---|---|
| Lipase H | LIPH (alias HYPT7) | Mutations in the lipase H gene are linked to autosomal recessive hypotrichosis | affinity chromatography technology | Endoplasmic reticulum and luminal side of membranes |
| Chromosome 19 open reading frame 75 | C19orf75 | Uncharacterized protein | affinity chromatography technology | Predicted membrane-associated |
| Neonatal fragment crystallizable receptor | FCRN | Neonatal Fc receptor. Responsible for transferring immunity from mother to newborn | affinity chromatography technology | Endosomal membrane, plasma membrane, and endoplasmic reticulum |
| Transmembrane protein 30B | TMEM30B | Subunit of P4-ATPase complex which is involved in the transport of lipids across the cell membrane | affinity chromatography technology | Endoplasmic reticulum and plasma membrane |
| B-cell antigen receptor complex-associated protein beta chain | CD79B | Forms part of the B-cell receptor and is required to initiate the signal cascade when antigen binds to the B cell | affinity chromatography technology | Plasma membrane |
| Platelet glycoprotein 4 | CD36 | Multifunctional glycoprotein that acts as a receptor for molecules such as fatty acids, collagen, and thrombospondin | affinity chromatography technology | Plasma membrane and endosome |
Table 2. Protein interactions found using BioGRID. All interactions were physical interactions with high-confidence scores.[19]
A STRING protein association search yielded no high-confidence associations, and none that appear significant in light of previous findings.[20]
Conceptual translation
A conceptual translation of human C3orf52 protein isoform 2, along with the full mRNA and annotations, is shown in Figure 4.
Clinical significance
Several published studies of C3orf52 focus on the likely involvement of this gene in lipase H-mediated lysophosphatidic acid biosynthesis, a step in hair-follicle formation.[21] Decreased expression of C3orf52 has been linked to localized autosomal recessive hypotrichosis, a condition characterized by sparse or absent hair.[22] Three single-nucleotide polymorphisms with clinical significance have been linked to hypotrichosis 15 (rs764787339, rs2472299130, rs545208237) (Table 3).
Beyond hair loss, the literature suggests a few other potential links between this gene and disease, specifically several cancers. One study found an association between this gene and the development of multifocal and multicentric breast cancer, and examined it as a marker for distinguishing multifocal and multicentric breast cancer from unifocal breast cancer.[23]
Another study, which examined DNA copy number variations, which are common in cancer cells, proposes C3orf52 as a potential prognostic marker for cancer.[24] Additionally, C3orf52 is downregulated in clear-cell renal cell carcinoma, and its reduced expression was linked to later disease stage and poorer overall survival in patients with this cancer.[25]
Single nucleotide polymorphisms
| SNP | Position | Base change | AA change | Mutation type | Significance | Clinical significance |
|---|---|---|---|---|---|---|
| rs764787339 | bp 34; AA 12 | G to A; G to C; G to T | Glu12Lys; Glu12Gln; Glu12Ter | Missense variant; Missense variant; Stop gained | Within the disordered region; also within the very beginning of the coding sequence | Hypotrichosis 15 |
| rs2472299130 | bp 438 - 442; AA 148 | N/A | Thr148fs (frameshift variant) | Deletion | Directly following a string of highly conserved AA | Hypotrichosis 15 |
| rs545208237 | bp 492 | T to A; T to C | Tyr164Ter; Tyr164= | Stop gained; Synonymous variant | On a conserved AA | Hypotrichosis 15 |
| rs16859190 | bp 331; AA 111 | A to G | I to V | Missense variant | On a non-conserved AA within a string of conserved AA | None |
| rs340167 | bp 430; AA 144 | G to A; G to C | G to R; G to C | Missense variants | Within SEA domain | None |
| rs16859172 | bp 197; AA 66 | T to C | L to P | Missense variant | On a conserved AA, within transmembrane region | None |
| rs111954756 | bp 566; AA 189 | G to C | G to A | Missense variant | On a non-conserved AA within a string of conserved AA | None |
Table 3. Summary of common single-nucleotide polymorphisms within human C3orf52, including their positions and significance. Single-nucleotide polymorphisms were found using Variation Viewer.[26]
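The base-pair and amino-acid positions in Table 3 are related by simple codon arithmetic if the base-pair positions are read as 1-based coordinates within the coding sequence, which is an assumption made here. The Python sketch below maps a coding-sequence position to its codon number and to its position within that codon, then replays the two substitutions reported at bp 492. The helper names are illustrative, and the reference codon `TAT` is inferred from the table (a tyrosine codon whose third base is T) rather than read from the sequence record.

```python
def codon_number(cds_position: int) -> int:
    """1-based codon (amino acid) index for a 1-based coding-sequence position."""
    if cds_position < 1:
        raise ValueError("cds_position must be 1 or greater")
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position: int) -> int:
    """Position of the nucleotide within its codon (1, 2, or 3)."""
    if cds_position < 1:
        raise ValueError("cds_position must be 1 or greater")
    return (cds_position - 1) % 3 + 1


def substitute(codon: str, position: int, base: str) -> str:
    """Return the codon with the base at the given 1-based position replaced."""
    return codon[:position - 1] + base + codon[position:]


# Partial codon table covering only the codons used in this example.
CODON_TABLE = {"TAT": "Tyr", "TAC": "Tyr", "TAA": "Ter", "TAG": "Ter"}

# (bp, AA) pairs taken from Table 3; AA 12 and AA 164 come from the AA change column.
for bp, aa in [(34, 12), (492, 164), (331, 111), (430, 144), (197, 66), (566, 189)]:
    print(f"bp {bp} -> codon {codon_number(bp)} (table: AA {aa}), "
          f"codon position {position_in_codon(bp)}")

# rs545208237 at bp 492: third position of codon 164, inferred reference codon TAT.
reference = "TAT"
for alt in ("A", "C"):
    mutant = substitute(reference, position_in_codon(492), alt)
    print(f"T to {alt}: {reference} ({CODON_TABLE[reference]}) -> "
          f"{mutant} ({CODON_TABLE[mutant]})")
```

Under these assumptions the computed codon numbers agree with the amino-acid positions in the table, and the two substitutions at bp 492 reproduce the stop-gained (Tyr164Ter) and synonymous (Tyr164=) outcomes listed there.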
| SNP | Trait | Location |
|---|---|---|
| rs12053863-? | Glucose metabolism; Hearing impairment | 3:112100677 |
| rs79754744-G | Eosinophil count | 3:112111457 |
| rs76093951-T | Bilirubin measurement | 3:112117176 |
| rs7649379-? | Eosinophil count | 3:112121666 |
| rs1492488-C | Eosinophil percentage of leukocytes | 3:112122499 |
Table 4. Summary of GWAS Catalog results.[27] Most of the associated single-nucleotide polymorphisms relate to immune traits, particularly eosinophil count.
References
- GRCh38: Ensembl release 89: ENSG00000114529, ENSG00000285394 – Ensembl, May 2017
- GRCm38: Ensembl release 89: ENSMUSG00000033187 – Ensembl, May 2017
- "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
- "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
- "TPA-induced transmembrane protein isoform 2 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2025-12-12.
- "C3orf52 chromosome 3 open reading frame 52 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2025-12-12.
- GeneCards Human Gene Database. "C3orf52 Gene - GeneCards | TTMP Protein | TTMP Antibody". www.genecards.org. Archived from the original on 2024-05-09. Retrieved 2025-12-12.
- NCBI GEO Profiles
- "C3orf52 protein expression summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2025-12-12.
- "Home - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2025-12-12.
- "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk. Retrieved 2025-12-01.
- EMBL-EBI, European Bioinformatics Institute. "Job Dispatcher homepage | EMBL-EBI". www.ebi.ac.uk. Retrieved 2025-12-01.
- "PSORT II Prediction". psort.hgc.jp. Retrieved 2025-12-03.
- Pei J, Grishin NV (March 2017). "Expansion of divergent SEA domains in cell surface proteins and nucleoporin 54". Protein Science. 26 (3): 617–630. doi:10.1002/pro.3096. PMC 5326570. PMID 27977898.
- "PhosphoSitePlus". www.phosphosite.org. Retrieved 2025-12-12.
- "SignalP 6.0 - DTU Health Tech - Bioinformatic Services". services.healthtech.dtu.dk. Retrieved 2025-12-12.
- "NetOGlyc 4.0 - DTU Health Tech - Bioinformatic Services". services.healthtech.dtu.dk. Retrieved 2025-12-12.
- "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2025-12-12.
- "BioGRID | Database of Protein, Chemical, and Genetic Interactions". thebiogrid.org. Retrieved 2025-12-12.
- "STRING: functional protein association networks". string-db.org. Retrieved 2025-12-12.
- Shah K, Basit S, Ali G, Ramzan K, Ansar M, Ahmad W (June 2021). "A novel homozygous frameshift variant in the C3orf52 gene underlying isolated hair loss in a consanguineous family". European Journal of Dermatology. 31 (3): 409–411. doi:10.1684/ejd.2021.4053. PMID 34309526.
- Malki L, Sarig O, Cesarato N, Mohamad J, Canter T, Assaf S, et al. (July 2020). "Loss-of-function variants in C3ORF52 result in localized autosomal recessive hypotrichosis". Genetics in Medicine. 22 (7): 1227–1234. doi:10.1038/s41436-020-0794-5. PMC 7405639. PMID 32336749.
- Kang Z, Guo L, Zhu Z, Qu R (2020). "Identification of prognostic factors for intrahepatic cholangiocarcinoma using long non-coding RNAs-associated ceRNA network". Cancer Cell International. 20: 315. doi:10.1186/s12935-020-01388-4. PMC 7364620. PMID 32694937.
- Iranmanesh SM, Guo NL (January 2014). "Integrated DNA Copy Number and Gene Expression Regulatory Network Analysis of Non-small Cell Lung Cancer Metastasis". Cancer Informatics. 13 (Suppl 5): 13–23. doi:10.4137/cin.s14055. PMC 4218678. PMID 25392690.
- Mlcochova H, Machackova T, Rabien A, Radova L, Fabian P, Iliev R, et al. (August 2016). "Epithelial-mesenchymal transition-associated microRNA/mRNA signature is linked to metastasis and prognosis in clear-cell renal cell carcinoma". Scientific Reports. 6 (1): 31852. Bibcode:2016NatSR...631852M. doi:10.1038/srep31852. PMC 4994011. PMID 27549611.
- "Variation Viewer". www.ncbi.nlm.nih.gov. Retrieved 2025-12-12.
- "GWAS Catalog". www.ebi.ac.uk. Retrieved 2025-12-12.