The human tissue specific proteome
All human genes, approximately 20000, are classified according to their expression across a large number of tissues representing all major organs and tissue types in the human body. Genes with elevated expression in a particular tissue are a useful starting point for understanding the biology and function of that part of the body, although only a few of these genes are expressed strictly in a single tissue or organ.
- A total of 11069 genes are elevated in at least one of the analyzed tissues, of which:
- 2845 are tissue enriched genes
- 1637 are group enriched genes
- 6587 are enhanced genes
Based on transcriptomics analysis across all major organs and tissue types in the human body, all 19670 putative protein coding genes have been classified with regard to specificity and distribution of transcribed mRNA molecules (Figure 1). This includes 8385 genes with low tissue specificity (read more in The housekeeping proteome) and 11069 genes showing significantly elevated expression in a particular tissue or a group of related tissues.
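The counts quoted above can be checked against each other with a few lines of arithmetic. The sketch below uses only numbers from the text; the variable names are illustrative. The remainder it computes is not described in the text, and reading it as the genes not detected in any tissue is an assumption.

```python
# Counts quoted in the text above.
tissue_enriched = 2845
group_enriched = 1637
tissue_enhanced = 6587
low_specificity = 8385
protein_coding_total = 19670

# The three elevated subcategories should add up to the elevated total.
elevated = tissue_enriched + group_enriched + tissue_enhanced

# Genes that fall in neither the elevated nor the low-specificity group
# (assumption: these are genes not detected in any tissue).
remainder = protein_coding_total - elevated - low_specificity

print(elevated)   # 11069
print(remainder)  # 216
```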
Specificity illustrates the number of genes with elevated or non-elevated expression. Elevated expression includes three subcategory types:
- Tissue enriched: At least four-fold higher mRNA level in a particular tissue compared to any other tissue.
- Group enriched: At least four-fold higher average mRNA level in a group of 2-5 tissues compared to any other tissue.
- Tissue enhanced: At least four-fold higher mRNA level in a particular tissue compared to the average level in all other tissues.
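The three definitions above can be read as a decision procedure applied to one gene at a time. The following is a minimal sketch of that reading, not the Human Protein Atlas pipeline itself: the function name, the input format (a dict of tissue name to NX value) and the reuse of the NX≥1 cutoff to label a gene as not detected are all assumptions, and the real classification may apply further conditions.

```python
from statistics import mean

FOLD = 4.0       # fold-change threshold from the definitions above
DETECTION = 1.0  # assumption: reuse the NX >= 1 detection cutoff


def classify_specificity(levels):
    """Classify one gene from a dict of tissue name -> mRNA level (NX)."""
    if len(levels) < 2:
        raise ValueError("need expression levels for at least two tissues")
    values = sorted(levels.values(), reverse=True)
    top, rest = values[0], values[1:]
    if top < DETECTION:
        return "not detected"
    # Tissue enriched: top tissue is >= 4x every other tissue,
    # which is equivalent to >= 4x the second-highest tissue.
    if top >= FOLD * rest[0]:
        return "tissue enriched"
    # Group enriched: the mean of the top 2-5 tissues is >= 4x the
    # highest tissue outside that group.
    for size in range(2, 6):
        if size >= len(values):
            break
        if mean(values[:size]) >= FOLD * values[size]:
            return "group enriched"
    # Tissue enhanced: top tissue is >= 4x the mean of all other tissues.
    if top >= FOLD * mean(rest):
        return "tissue enhanced"
    return "low tissue specificity"


# Toy values, invented for illustration only.
print(classify_specificity({"liver": 100, "kidney": 5, "lung": 2}))
print(classify_specificity({"esophagus": 80, "skin": 60, "lung": 5, "liver": 3}))
```

Because the values are sorted in descending order, testing only the top-ranked tissue or the top-ranked group is enough: if any tissue or group of a given size meets a threshold, the highest-ranked one does too.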
Distribution, on the other hand, shows how many genes have, or do not have, detectable levels (NX≥1) of transcribed mRNA molecules. As shown in Table 1, all elevated genes are categorized as:
- Detected in single: Detected in a single tissue
- Detected in some: Detected in more than one but fewer than one third of tissues
- Detected in many: Detected in at least a third but not all tissues
- Detected in all: Detected in all tissues
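The distribution categories above depend only on how many tissues pass the NX≥1 cutoff. A minimal sketch, assuming the same dict-of-NX-values input as a hypothetical format; the "not detected" label for genes below the cutoff everywhere is an assumption, since it is not one of the four categories listed.

```python
def classify_distribution(levels, cutoff=1.0):
    """Classify one gene by the number of tissues with NX >= cutoff."""
    n_total = len(levels)
    n_detected = sum(1 for value in levels.values() if value >= cutoff)
    if n_detected == 0:
        return "not detected"  # assumption: not listed in the text
    if n_detected == 1:
        return "detected in single"
    if n_detected == n_total:
        return "detected in all"
    # "Fewer than one third" compared in integers to avoid rounding issues.
    if 3 * n_detected < n_total:
        return "detected in some"
    return "detected in many"


def toy_gene(n_detected, n_total=37):
    """Build invented NX values with n_detected tissues above the cutoff."""
    return {
        f"tissue_{i}": (5.0 if i < n_detected else 0.0)
        for i in range(n_total)
    }


print(classify_distribution(toy_gene(12)))  # detected in some
print(classify_distribution(toy_gene(13)))  # detected in many
```

With 37 tissues, one third is about 12.3, so 12 detected tissues fall under "detected in some" and 13 under "detected in many".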
Figure 1. (A) The distribution of all genes across the five categories based on transcript specificity in all 37 analyzed tissues. (B) The distribution of all genes across the six categories based on transcript detection (NX≥1) in all 37 analyzed tissues.
Table 1. Number of genes in the subdivided categories of elevated expression in all 37 analyzed tissues.
The number of tissue elevated genes varies greatly between the analyzed tissue types (see Table 2 below). Testis shows the largest number of tissue enriched genes (n=950), followed by brain (n=488) and liver (n=242). The large number of enriched genes in testis is considered to be due to the highly specialized processes occurring during spermatogenesis. It is also likely that many of these genes have a shared expression with oocytes in the female ovaries. However, oocytes are difficult to analyze because of the complex kinetics of female germ cell development, including the first rounds of meiosis, which in females occur at the embryonic stage. Some tissues have similar functions and morphology and, as expected, the tissue elevated genes in these tissues are predominantly group enriched, as exemplified by lymphoid tissues and the gastrointestinal tract.
In addition to previously known proteins, the analysis also identified a large number of genes with tissue elevated expression patterns that were previously poorly characterized, with no or only scarce evidence of existence at the protein level. The combined RNA and antibody-based profiling can thus be used to confirm physiological functions of such protein coding genes lacking previous annotation. These proteins are interesting starting points for further in-depth studies to gain a better understanding of the molecular mechanisms of the various cellular phenotypes that define the function of each respective tissue and organ.
Table 2. Tissue elevated genes.
Tissue enriched genes
The comprehensive analysis presented here has identified 2845 human genes that display a tissue enriched expression pattern across the human body. A functional analysis revealed that the overall function of a tissue was highly associated with the function of the proteins encoded by the genes enriched in that tissue. Antibody-based protein profiling using immunohistochemistry allows for localization of the proteins corresponding to the different tissue enriched genes, and provides a precise map of protein expression in the various compartments and cell types that constitute different tissues and organs. Examples of tissue type specific proteins with a direct link to tissue function are presented below.
Brain
- GFAP (Glial fibrillary acidic protein) - astrocyte intermediate filament protein
- MBP (Myelin basic protein) - major constituent of the myelin sheath
GFAP - cerebral cortex
MBP - hippocampus
Retina
- RHO (Rhodopsin) - involved in phototransduction in rod photoreceptors
- ARR3 (Arrestin 3) - involved in phototransduction in cone photoreceptors
RHO - retina
ARR3 - retina
Pituitary gland
- FSHB (Follicle stimulating hormone beta subunit) - hormone inducing egg and sperm production
- TSHB (Thyroid stimulating hormone beta) - hormone regulating thyroid gland function
FSHB - pituitary gland
TSHB - pituitary gland
Kidney
- SLC22A13 (Solute carrier family 22 member 13) - membrane bound organic anion transporter
- NPHS2 (Podocin) - involved in the regulation of glomerular permeability
SLC22A13 - kidney
NPHS2 - kidney
Liver
- ALB (Albumin) - plasma protein
- CYP2A13 (Cytochrome P450 family 2 subfamily A member 13) - involved in drug metabolism and cholesterol and steroid synthesis
ALB - liver
CYP2A13 - liver
Pancreas
- AMY2A (Amylase, alpha 2A) - an enzyme that digests carbohydrates, secreted by exocrine cells
- INS (Insulin) - involved in lowering of blood glucose, secreted by endocrine cells
AMY2A - pancreas
INS - pancreas
Male tissues
- DMRT1 (Doublesex- and mab-3-related transcription factor 1) - testis enriched, involved in meiosis
- PRM2 (Protamine 2) - important for spermatogenesis
DMRT1 - testis
PRM2 - testis
Female tissues
- CSH1 (Chorionic somatomammotropin hormone 1) - hormone important for growth control during pregnancy
- OVGP1 (Oviductal glycoprotein 1) - mucus protein important in mucociliary transport of the fertilized ovum
CSH1 - placenta
OVGP1 - fallopian tube
Skin
- KRT1 (Keratin 1) - involved in squamous differentiation and skin barrier function
- KRT27 (Keratin 27) - plays a role in hair formation
KRT1 - skin
KRT27 - hair
Group enriched genes
The 1637 genes identified with a group enriched expression pattern are genes with shared expression in a limited number of tissues. The corresponding proteins may be involved in functions that are shared between cell types located in different tissues and organs, such as proteins expressed in immune cells (various lymphoid tissues), proteins involved in squamous differentiation (esophagus and skin), glandular cell function in the gastrointestinal tract (duodenum, small intestine and colon) or cilia movement (testis and fallopian tube). The schematic network plot below shows the distribution of group enriched genes across different tissues.
Figure 3. An interactive network plot of the tissue enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of tissue enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 3 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.
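The aggregation behind a network plot of this kind can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the code behind the figure: the function names are hypothetical, GENE_X is an invented placeholder, and the tissue sets assigned to the real gene symbols are simplified from the examples on this page rather than taken from the underlying data.

```python
from collections import Counter

MAX_PLOTTED_GROUP = 3  # the plot only draws combinations of up to 3 tissues


def build_network(gene_tissues):
    """Count genes per tissue combination, keeping plottable combinations.

    gene_tissues maps a gene symbol to the set of tissues in which it is
    tissue enriched (one tissue) or group enriched (several tissues).
    """
    nodes = Counter()
    for tissues in gene_tissues.values():
        combo = frozenset(tissues)
        if len(combo) <= MAX_PLOTTED_GROUP:
            nodes[combo] += 1
    return nodes


def genes_for_tissue(gene_tissues, tissue):
    """List every enriched gene for a tissue, regardless of group size."""
    return sorted(g for g, ts in gene_tissues.items() if tissue in ts)


# Illustrative input only; GENE_X is a made-up four-tissue example.
example = {
    "DMRT1": {"testis"},
    "PRM2": {"testis"},
    "ALB": {"liver"},
    "DSC3": {"esophagus", "skin"},
    "DNAI2": {"fallopian tube", "testis"},
    "GENE_X": {"testis", "fallopian tube", "lung", "kidney"},
}

nodes = build_network(example)
for combo, count in nodes.items():
    kind = "tissue enriched" if len(combo) == 1 else "group enriched"
    print(sorted(combo), kind, count)

print(genes_for_tissue(example, "testis"))
```

Keeping the node counts and the per-tissue gene lists as separate functions mirrors the caption: the plotted nodes omit groups of more than three tissues, while the list for a tissue still includes every group enriched gene.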
Immune cells can be found in both lymphoid organs and organs infiltrated by immune cells, such as the intestine. Consequently, genes important for immune cell function are often enriched in both lymphoid tissues and the intestine. One such gene is MS4A1, which encodes CD20, an activated-glycosylated phosphoprotein expressed on the surface of B-cells beginning at the pro-B phase with progressively increasing concentrations until maturity.
MS4A1 - lymph node
MS4A1 - appendix
MS4A1 - small intestine
Squamous epithelium is found in many parts of the body as dry skin or wet mucosa, acting as a robust barrier against various chemical and mechanical stresses. Desmocollin 3 (DSC3), which codes for a protein important in cell-cell junctions and cellular adhesion, is group enriched in epithelial tissues such as esophagus and skin, as exemplified below.
DSC3 - esophagus
DSC3 - skin
Mucus has several functions in the body related to transportation and barrier functions. The function of the mucus in the salivary gland is related to food and pathogens, while the mucus in the cervix is involved in, for example, transportation and blockage of sperm during sexual reproduction. MUC16 is a mucus component and is group enriched in both the mucus-producing salivary gland and cervix.
MUC16 - salivary gland
MUC16 - cervix
The fallopian tube shares many elevated genes with testis. The common denominator is the utilization of cilia, or the structurally similar flagellum, for essential organ functions. DNAI2, a dynein protein, constitutes a motor protein component of motile cilia of multiciliated cells as well as the flagellum (tail) of the sperm. By pulling on the microtubule structure of the cilium or flagellum, the motor protein creates motion and, in the case of the sperm, sperm motility. In the IHC images below, expression of DNAI2 can be seen in a subset of cilia in the fallopian tube (left and middle image), as well as in the flagellum of spermatids and the cytoplasm of differentiating spermatocytes (right image).
DNAI2 - fallopian tube
DNAI2 - fallopian tube ciliated cells
DNAI2 - testis
Tissue enhanced genes
The category “tissue enhanced genes” is defined as genes that do not fulfill the criteria of tissue enriched but show an at least four-fold higher mRNA level in a specific tissue type compared to the average level in all other analyzed tissue types.
Examples of protein expression
Below are examples of protein expression patterns of mainly known and well characterized elevated genes in the different parts of the body.
Brain
ELAVL3 - cerebral cortex
SLC17A7 - caudate
Endocrine tissues
TG - thyroid gland
TPO - thyroid gland
HSD3B2 - adrenal gland
PNMT - adrenal gland
Lung
SFTPA1 - lung
SFTPB - lung
Proximal digestive tract
STATH - salivary gland
KRT4 - esophagus
Gastrointestinal tract
PGA4 - stomach
DEFA5 - duodenum
FABP2 - duodenum
FABP6 - small intestine
KRT20 - colon
Pancreas
CPA2 - pancreas
GCG - pancreas
Liver & gallbladder
HP - liver
UGT2B4 - liver
CHST4 - gallbladder
Kidney & urinary bladder
SLC22A8 - kidney
UPK2 - urinary bladder
Male tissues
SEMG1 - seminal vesicle
KLK3 - prostate
Female tissues
SNTN - fallopian tube
MUM1L1 - ovary
PAEP - endometrium
GH2 - placenta
Muscle tissues
TNNI3 - heart muscle
TNNT2 - heart muscle
MYH7 - skeletal muscle
Adipose & soft tissue
FABP4 - adipose tissue (soft tissue)
PLIN1 - adipose tissue (breast)
Skin
CASP14 - skin
KRT82 - hair
Bone marrow
MPO - bone marrow
DEFA1 - bone marrow
ITGAM - bone marrow
Lymphoid tissue
CD8B - thymus
CD19 - appendix
CD72 - spleen
CD22 - lymph node
Table 3. Tissue specific scores and mRNA levels (measured as NX) are given for the above selected examples of tissue type elevated proteins.