FAM200B

Identifiers
Aliases: FAM200B, family with sequence similarity 200 member B, C4orf53
External IDs: HomoloGene: 122130; GeneCards: FAM200B; OMA: FAM200B - orthologs

Orthologs
Species Human Mouse
Entrez 285550 n/a
Ensembl ENSG00000237765 n/a
UniProt P0CF97 n/a
RefSeq (mRNA) NM_001145191 n/a
RefSeq (protein) NP_001138663 n/a
Location (UCSC) Chr 4: 15.68–15.71 Mb n/a
PubMed search [2] n/a

FAM200B (family with sequence similarity 200 member B) is a protein that in humans is encoded by the FAM200B gene. The gene encodes a 657-amino-acid protein.[4] FAM200B is a large intracellular protein with no well-defined functional domains. No experimentally determined structure is available, but predicted tertiary structures are.[3] FAM200B is expressed at moderate levels in all tissues examined, with relatively higher expression in the brain and thymus.[4] Although its function remains unknown, its predicted nuclear localization and expression pattern suggest that FAM200B may be involved in regulatory processes such as gene expression or protein-protein interactions.[4][5]

Gene

FAM200B, also known as C4orf53, is a protein-coding gene located on chromosome 4p15.32.[4] Its reference transcript is 4,287 nucleotides long and contains two exons.[4] The gene has multiple transcript variants that encode two protein isoforms.[4]

Transcripts

The canonical FAM200B transcript is NM_001145191.2, which spans approximately 4.3 kb and consists of two exons, with the second exon (bases 76–4287) comprising nearly the entire coding sequence.[4] FAM200B has multiple transcript variants that together encode two protein isoforms (see Table 1).[4]

Table 1: Transcript and protein isoforms of the human FAM200B gene.[4]

Transcript accession Transcript length (nt) Protein accession Protein length (aa) Isoform
NM_001145191.2 4,287 NP_001138663.1 657 MANE Select
XM_017008048.2 4,397 XP_016863537.1 657 X1
XM_024453999.2 3,822 XP_024309767.1 657 X1
XM_024454000.2 3,818 XP_024309768.1 657 X1
XM_024454001.2 3,942 XP_024309769.1 657 X1
XM_024454003.2 3,938 XP_024309771.1 657 X1
XM_024454005.2 3,803 XP_024309773.1 657 X1
XM_024454006.2 3,799 XP_024309774.1 657 X1
XM_024454008.2 3,749 XP_024309776.1 480 X2
XM_024454009.2 3,869 XP_024309777.1 480 X2
XM_024454010.2 3,979 XP_024309778.1 480 X2
XM_024454011.2 3,730 XP_024309779.1 480 X2
XM_047450103.1 4,464 XP_047306059.1 657 X1
XM_047450104.1 4,517 XP_047306060.1 657 X1
XM_047450106.1 4,342 XP_047306062.1 657 X1
XM_047450107.1 4,378 XP_047306063.1 657 X1
XM_047450108.1 3,889 XP_047306064.1 657 X1
XM_047450109.1 3,885 XP_047306065.1 657 X1
XM_047450110.1 4,480 XP_047306066.1 657 X1
XM_047450112.1 4,840 XP_047306068.1 657 X1
XM_047450113.1 3,816 XP_047306069.1 480 X2
XM_047450114.1 3,926 XP_047306070.1 480 X2
XM_047450115.1 3,859 XP_047306071.1 480 X2
XM_047450117.1 3,804 XP_047306073.1 480 X2
XM_054349762.1 4,559 XP_054205737.1 657 X1
XM_054349763.1 4,612 XP_054205738.1 657 X1
XM_054349764.1 4,394 XP_054205739.1 657 X1
XM_054349765.1 4,339 XP_054205740.1 657 X1
XM_054349766.1 4,375 XP_054205741.1 657 X1
XM_054349767.1 3,819 XP_054205742.1 657 X1
XM_054349768.1 3,815 XP_054205743.1 657 X1
XM_054349769.1 3,984 XP_054205744.1 657 X1
XM_054349770.1 3,980 XP_054205745.1 657 X1
XM_054349771.1 4,037 XP_054205746.1 657 X1
XM_054349772.1 4,033 XP_054205747.1 657 X1
XM_054349773.1 4,477 XP_054205748.1 657 X1
XM_054349774.1 4,837 XP_054205749.1 657 X1
XM_054349775.1 3,800 XP_054205750.1 657 X1
XM_054349776.1 3,796 XP_054205751.1 657 X1
XM_054349777.1 3,746 XP_054205752.1 480 X2
XM_054349778.1 3,911 XP_054205753.1 480 X2
XM_054349779.1 3,964 XP_054205754.1 480 X2
XM_054349780.1 4,074 XP_054205755.1 480 X2
XM_054349781.1 4,021 XP_054205756.1 480 X2
XM_054349782.1 3,856 XP_054205757.1 480 X2
XM_054349783.1 3,727 XP_054205758.1 480 X2
XM_054349784.1 3,801 XP_054205759.1 480 X2
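The relationship between transcript length and protein length in Table 1 can be checked with simple arithmetic. The sketch below is illustrative only: it parses a handful of rows copied from Table 1 and assumes each transcript carries a complete coding sequence with one codon per residue plus a single stop codon, so that the remainder of the transcript is untranslated. The helper names `parse_row` and `cds_length` are hypothetical and not part of any cited tool.

```python
# A few rows copied from Table 1:
# transcript accession, transcript length (nt), protein accession,
# protein length (aa), isoform label.
ROWS = """\
NM_001145191.2 4,287 NP_001138663.1 657 MANE Select
XM_017008048.2 4,397 XP_016863537.1 657 X1
XM_024454008.2 3,749 XP_024309776.1 480 X2
XM_047450112.1 4,840 XP_047306068.1 657 X1
XM_054349783.1 3,727 XP_054205758.1 480 X2
"""


def parse_row(line):
    """Split one flattened table row into named fields."""
    transcript, nt, protein, aa, *label = line.split()
    return {
        "transcript": transcript,
        "transcript_nt": int(nt.replace(",", "")),
        "protein": protein,
        "protein_aa": int(aa),
        "isoform": " ".join(label),  # "MANE Select" is two words
    }


def cds_length(protein_aa):
    """Coding length in nt, assuming one codon per residue plus a stop codon."""
    return 3 * (protein_aa + 1)


records = [parse_row(line) for line in ROWS.splitlines()]
for rec in records:
    rec["cds_nt"] = cds_length(rec["protein_aa"])
    # Whatever is not coding is assumed to be 5' and 3' untranslated sequence.
    rec["noncoding_nt"] = rec["transcript_nt"] - rec["cds_nt"]
    print(rec["transcript"], rec["isoform"], rec["cds_nt"], rec["noncoding_nt"])
```

Under these assumptions the 657-residue isoform needs 1,974 coding nucleotides and the 480-residue isoform 1,443, so in every listed transcript more than half of the length is non-coding.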

Protein

Human FAM200B encodes two protein isoforms: a longer 657-amino-acid isoform (isoform X1) and a shorter 480-amino-acid isoform (isoform X2); see Table 1. The canonical transcript is NM_001145191.2, while the remaining variants are predicted models.[4] The predicted molecular weight is ~76.0 kDa[6] and the approximate pI is 8.33.[4][7] The amino acid composition is enriched for Leu (~12.9%), Ser (~8.8%), Lys (~8.2%), and Glu (~8.1%).[7] FAM200B has no low-complexity regions, long tandem repeats, or significant charge clusters, and its charged residues are distributed evenly throughout the sequence. Only short, localized periodic motifs were detected, consistent with a soluble intracellular protein lacking large repetitive domains.[8] Although no experimentally validated domains have been defined, homology analyses across vertebrate orthologs identified conserved C2H2-type and BED-type zinc finger motifs in FAM200B.[9][10] Secondary structure analysis predicts a mixture of α-helices and β-strands, concentrated in a conserved central region of the protein.[11] The predicted tertiary structure suggests that FAM200B has a mostly globular fold with a well-structured core and more flexible N- and C-terminal regions.[3]
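Composition percentages and molecular weight of the kind quoted above are derived directly from the amino acid sequence. The following is a minimal sketch of that calculation, not the cited tool's implementation. It uses standard average residue masses and a short made-up peptide (`toy_sequence`, which is hypothetical and is not the FAM200B sequence); running it on the real sequence would require retrieving that sequence from a database first.

```python
from collections import Counter

# Standard average residue masses in daltons (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def composition_percent(sequence):
    """Return the percentage of each residue type in the sequence."""
    counts = Counter(sequence)
    return {aa: 100.0 * n / len(sequence) for aa, n in counts.items()}


def molecular_weight(sequence):
    """Average molecular weight in daltons of an unmodified linear chain."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


# Hypothetical peptide for illustration only; NOT the FAM200B sequence.
toy_sequence = "MLSLKELLSKE"

for aa, pct in sorted(composition_percent(toy_sequence).items(),
                      key=lambda item: -item[1]):
    print(f"{aa}: {pct:.1f}%")
print(f"MW: {molecular_weight(toy_sequence) / 1000:.2f} kDa")
```

As a rough plausibility check on the reported figure, 76.0 kDa spread over 657 residues corresponds to an average of about 116 Da per residue, which falls within the range of the residue masses listed in the table.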

Gene level regulation

FAM200B is expressed ubiquitously at moderate levels across human tissues, with relatively higher expression in brain and thymus. Promoter analysis identified ETC and ETV5::FOXJ1 motifs as high-scoring transcription factor binding sites (scores of 511 and 436). Both transcription factors are known to function in neural development and brain-related regulatory pathways, which makes them biologically plausible regulators given the higher expression of FAM200B in brain tissue.[12]
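Motif-based promoter analysis of this kind generally works by sliding a position weight matrix along the promoter sequence and scoring each window. The sketch below illustrates the general idea with a log-odds scan. Everything in it is an assumption for illustration: the count matrix `TOY_COUNTS` and the sequence are made up, they are not real JASPAR profiles or FAM200B promoter sequence, and the scores it produces are on a different scale from the 511 and 436 quoted above.

```python
import math

# Hypothetical 4-position count matrix; NOT a real JASPAR profile.
TOY_COUNTS = {
    "A": [8, 1, 1, 7],
    "C": [1, 7, 1, 1],
    "G": [0, 1, 7, 1],
    "T": [1, 1, 1, 1],
}


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts to per-position log2 odds against a uniform background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[base][i] for base in "ACGT") + 4 * pseudocount
        matrix.append({
            base: math.log2(((counts[base][i] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def best_hit(sequence, matrix):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(matrix)
    hits = [
        (sum(matrix[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(hits)


toy_matrix = log_odds_matrix(TOY_COUNTS)
score, start = best_hit("TTACGATT", toy_matrix)  # made-up sequence
print(f"best window starts at {start} with log-odds score {score:.2f}")
```

A high score only indicates that a window resembles the motif; whether the factor actually binds there is a separate experimental question.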

Protein level regulation

FAM200B is predicted to be a soluble nuclear protein, with no signal peptide or transmembrane domains and no evidence of secretion or membrane insertion.[5][13] Post-translational modification predictions indicate multiple serine, threonine, and tyrosine phosphorylation sites and multiple SUMOylation sites,[14] while no evidence was found for lipid anchor attachment, relevant glycosylation, or N-terminal acetylation.[15]

Homology

FAM200B is a vertebrate-specific gene with a conserved paralog, FAM200A, indicating a stable gene family structure across evolution. The two human paralogs, FAM200B and FAM200A, share 79.79% sequence identity and both contain the conserved Domain of Unknown Function 4371 (DUF4371), supporting a common evolutionary origin and functional similarity.[16][17] Comparative genome analysis shows that FAM200B orthologs are present throughout vertebrates, including mammals, birds, amphibians, and bony fishes, with no clear homologs detected in invertebrate lineages, suggesting that the gene emerged during early vertebrate evolution.[4]

The earliest identifiable FAM200B orthologs occur in Actinopterygii (ray-finned fish), indicating that the gene originated before the divergence of bony fish and tetrapods approximately 420–450 million years ago. Across vertebrates, the number of family members has remained stable at two paralogs, although distant orthologs show moderate differences in transcript length, exon composition, and alternative splicing patterns. Despite this divergence, the overall sequence and the conserved DUF4371 core are maintained.[4]

Table 2: 20 orthologs of the human FAM200B protein, from mammals (34-100% identity), birds and reptiles (25-34% identity), amphibians (35-38% identity), and bony fish (37-43% identity).[4][18]

No. Clade Genus, species Common name Taxonomic group Divergence date (MYA) Accession number Query cover (%) Sequence length (aa) Sequence identity (%) Sequence similarity (%)
- Mammalia Homo sapiens Human Primates 0 NP_001138663.1 100 657 100 100
1 Mammalia Pan troglodytes Chimpanzee Apes 6.4 XP_001139775.1 100 573 99 99
2 Mammalia Papio anubis Olive baboon Primates 28.8 XP_017814067.1 100 657 97 98
3 Mammalia Canis lupus familiaris Dog Carnivora 94 XP_038335570.0 88 813 91 95
4 Mammalia Monodelphis domestica Gray short-tailed opossum Marsupials 160 XP_056673701.1 75 748 34 54
5 Reptilia Natator depressus Flatback sea turtle Testudines 319 XP_074809886.1 91 655 34 56
8 Reptilia Chelonia mydas Green sea turtle Testudines 319 XP_043379535.1 98 624 34 57
6 Aves Oxyura jamaicensis Ruddy duck Aves 319 XP_035169477.1 87 564 27 47
7 Aves Caloenas nicobarica Nicobar pigeon Aves 319 XP_065484009.1 89 604 25 44
9 Amphibia Pleurodeles waltl Iberian ribbed newt Urodela 352 XP_069075336.1 92 617 38 59
10 Amphibia Dendrobates tinctorius Poison dart frog Anura 352 XP_073431629.1 89 625 38 59
11 Amphibia Ascaphus truei Tailed frog Anura 352 XP_075472991.1 100 638 37 57
12 Amphibia Rhinatrema bivittatum - Gymnophiona 352 XP_029452623.1 92 598 35 56
13 Osteichthyes Trichomycterus rosablanca Cave catfish Siluriformes 426 XP_062844886.1 91 614 43 64
14 Osteichthyes Astyanax mexicanus Mexican tetra Characiformes 429 XP_049334409.1 91 598 42 64
15 Osteichthyes Anoplopoma fimbria Sablefish Scorpaeniformes 429 XP_054473507.1 85 552 42 64
16 Osteichthyes Eleginops maclovinus Patagonian blennie Perciformes 429 XP_063763934.1 99 692 38 59
17 Osteichthyes Centroberyx gerrardi Bright redfish Beryciformes 429 XP_071783535.1 91 544 38 58
18 Osteichthyes Carassius auratus Goldfish Cypriniformes 429 XP_026126532.1 95 633 37 58
19 Osteichthyes Triplophysa rosa - Cypriniformes 429 XP_057204124.1 94 633 39 59
20 Osteichthyes Megalobrama amblycephala Wuchang bream Cypriniformes 429 XP_048064547.1 85 551 42 63
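Identity percentages such as those in Table 2 and the 79.79% figure for the two human paralogs are computed from a pairwise alignment. The sketch below shows one common way to do this, assuming the two sequences are already aligned and that identity is counted over all alignment columns containing at least one residue. Alignment tools differ in how they treat gaps and in the denominator they use, so this will not necessarily reproduce the published values. The function name `percent_identity` and both toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are ignored; columns where only
    one sequence has a gap count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residue columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Made-up aligned fragments for illustration; NOT real FAM200A/FAM200B sequence.
toy_a = "MKLSE-LLK"
toy_b = "MKLTEQLLR"
print(f"{percent_identity(toy_a, toy_b):.2f}% identity")
```

Similarity percentages are computed the same way, except that a column also counts when the two residues are judged chemically similar under a substitution matrix, which is why similarity is always at least as high as identity in Table 2.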

Function

FAM200B encodes a conserved intracellular protein of unknown function. Sequence and structural analyses indicate that FAM200B lacks catalytic motifs, signal peptides, and transmembrane domains, which suggests that it does not function as an enzyme, secreted factor, or membrane protein. Its predicted nuclear localization, the presence of regulatory post-translational modification sites (including phosphorylation and SUMOylation), and a limited number of zinc-finger-like motifs support a role in regulatory processes, possibly involving protein-protein interactions.

Interacting proteins

Interaction analysis identified few biologically plausible binding partners for FAM200B, most notably ANKRD45 and C1orf198. ANKRD45[19] contains ankyrin repeat domains that mediate protein-protein interactions, supporting a role for FAM200B within regulatory complexes, while C1orf198[5][13] is an uncharacterized protein associated with nuclear and regulatory proteins, suggesting that the two may act in the same protein complex. Other predicted partners lack compatible localization or functional context. Together, these predictions suggest that FAM200B acts as a nuclear regulatory protein that functions through protein-protein interactions within a protein complex.[20]

Clinical significance

FAM200B has no established association with human disease and no known pathogenic variants. However, its expression under specific cellular stressors suggests that FAM200B may function as a modifier gene influencing the strength, timing, or cellular context of disease-related pathways.[4]

References

  1. GRCh38: Ensembl release 89: ENSG00000237765. Ensembl, May 2017.
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk. Retrieved 2025-12-12.
  4. "Homo sapiens family with sequence similarity 200 member B (FAM200B), mRNA". 2025-04-28.
  5. "DeepLoc 2.0 - DTU Health Tech - Bioinformatic Services". services.healthtech.dtu.dk. Retrieved 2025-12-12.
  6. GeneCards Human Gene Database. "FAM200B Gene - GeneCards | F200B Protein | F200B Antibody". www.genecards.org. Archived from the original on 2022-12-05. Retrieved 2025-12-12.
  7. "Expasy - ProtParam". web.expasy.org. Retrieved 2025-12-12.
  8. European Bioinformatics Institute. "Job Dispatcher homepage | EMBL-EBI". www.ebi.ac.uk. Retrieved 2025-12-12.
  9. "Expasy - PROSITE". prosite.expasy.org. Retrieved 2025-12-12.
  10. "ScanProsite". prosite.expasy.org. Retrieved 2025-12-12.
  11. "JPred: A Protein Secondary Structure Prediction Server". www.compbio.dundee.ac.uk. Retrieved 2025-12-12.
  12. "Matrix profile: FAM200B - MA2588.1 - JASPAR". jaspar.elixir.no. Retrieved 2025-12-12.
  13. "PSORT WWW Server". psort.hgc.jp. Retrieved 2025-12-12.
  14. "SUMOplot™ Analysis Program | Abcepta". www.abcepta.com. Retrieved 2025-12-12.
  15. "Bioinformatic Tools and Services - DTU Health Tech". services.healthtech.dtu.dk. Retrieved 2025-12-12.
  16. "Pfam is now hosted by InterPro". pfam.xfam.org. Archived from the original on 2025-10-01. Retrieved 2025-12-12.
  17. "Multiple Sequence Alignment - CLUSTALW". www.genome.jp. Retrieved 2025-12-12.
  18. "TimeTree :: The Timescale of Life". timetree.org. Retrieved 2025-12-12.
  19. Kang Y, Xie H, Zhao C (June 2019). "Ankrd45 Is a Novel Ankyrin Repeat Protein Required for Cell Proliferation". Genes. 10 (6): 462. doi:10.3390/genes10060462. PMC 6628321. PMID 31208154.
  20. "STRING: functional protein association networks". string-db.org. Retrieved 2025-12-12.