The human tissue specific proteome

All human genes, approximately 20000 in total, are classified according to their expression across a large number of tissues representing all major organs and tissue types in the human body. Genes with elevated expression in a particular tissue are a useful starting point for understanding the biology and function of that part of the body, although only a few of these genes are expressed strictly in a single tissue or organ.

A total of 11069 genes are elevated in at least one of the analyzed tissues, of which:

  • 2845 are tissue enriched genes
  • 1637 are group enriched genes
  • 6587 are tissue enhanced genes


Based on transcriptomics analysis across all major organs and tissue types in the human body, all 19670 putative protein coding genes have been classified with regard to specificity and distribution of transcribed mRNA molecules (Figure 1). This includes 8385 genes with low tissue specificity (read more in The housekeeping proteome) and 11069 genes showing a significantly elevated level of expression in a particular tissue or a group of related tissues.

Specificity illustrates the number of genes with elevated or non-elevated expression. Elevated expression includes three subcategory types:

  • Tissue enriched: At least four-fold higher mRNA level in a particular tissue compared to any other tissue.
  • Group enriched: At least four-fold higher average mRNA level in a group of 2-5 tissues compared to any other tissue.
  • Tissue enhanced: At least four-fold higher mRNA level in a particular tissue compared to the average level in all other tissues.
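
The three rules above can be sketched in code. The following is a minimal illustration, not the Human Protein Atlas pipeline: the function name, the category labels as strings, and the choice to test the group rule on the top-ranked 2-5 tissues are assumptions made for the example, and any additional conditions used in the real classification are not modeled.

```python
from statistics import mean

FOLD = 4.0          # four-fold threshold taken from the definitions above
DETECTION_NX = 1.0  # detection cutoff (NX >= 1) used in this section


def classify_specificity(nx_by_tissue):
    """Return a specificity category for one gene (hypothetical helper).

    nx_by_tissue maps a tissue name to its normalized expression (NX).
    """
    # Rank expression values from highest to lowest.
    values = sorted(nx_by_tissue.values(), reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two tissues to compare")
    if values[0] < DETECTION_NX:
        return "not detected"

    # Tissue enriched: the top tissue is >= 4x every other tissue,
    # which is equivalent to being >= 4x the second-highest value.
    if values[0] >= FOLD * values[1]:
        return "tissue enriched"

    # Group enriched: the mean of the top 2-5 tissues is >= 4x the
    # highest value outside the group (assumption: the group is made
    # of the top-ranked tissues).
    for size in range(2, 6):
        if size >= len(values):
            break
        if mean(values[:size]) >= FOLD * values[size]:
            return "group enriched"

    # Tissue enhanced: the top tissue is >= 4x the mean of all others.
    if values[0] >= FOLD * mean(values[1:]):
        return "tissue enhanced"

    return "low tissue specificity"


if __name__ == "__main__":
    # Made-up NX values, for illustration only.
    print(classify_specificity({"liver": 100.0, "kidney": 10.0, "lung": 5.0}))
```

The rules are tested in order of strictness, so a gene that satisfies the tissue enriched rule is never reported as group enriched or tissue enhanced.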

Distribution, on the other hand, shows how many genes have, or do not have, detectable levels (NX≥1) of transcribed mRNA molecules. As shown in Table 1, all elevated genes fall into one of the following categories:

  • Detected in single: Detected in a single tissue
  • Detected in some: Detected in more than one but less than one third of tissues
  • Detected in many: Detected in at least a third but not all tissues
  • Detected in all: Detected in all tissues
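
As a sketch of how the distribution categories could be assigned, the function below counts the tissues in which a gene reaches the NX≥1 cutoff and maps that count to a category. The function name and the "not detected" label for genes below the cutoff in every tissue are assumptions for the example.

```python
def classify_distribution(nx_by_tissue, cutoff=1.0):
    """Return a distribution category for one gene (hypothetical helper).

    nx_by_tissue maps a tissue name to its normalized expression (NX).
    """
    total = len(nx_by_tissue)
    # A gene counts as detected in a tissue when NX >= cutoff.
    detected = sum(1 for nx in nx_by_tissue.values() if nx >= cutoff)

    if detected == 0:
        return "not detected"
    if detected == 1:
        return "detected in single"
    if detected == total:
        return "detected in all"
    # "Less than one third of tissues", written without division.
    if 3 * detected < total:
        return "detected in some"
    return "detected in many"
```

With 37 tissues, one third is about 12.3, so under these rules a gene detected in 2 to 12 tissues is "detected in some" and a gene detected in 13 to 36 tissues is "detected in many".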

A. Specificity

B. Distribution

Figure 1. (A) The distribution of all genes across the five categories based on transcript specificity in all 37 analyzed tissues. (B) The distribution of all genes across the six categories based on transcript detection (NX≥1) in all 37 analyzed tissues.


Table 1. Number of genes in the subdivided categories of elevated expression in all 37 analyzed tissues.

Distribution in the 37 tissues

Specificity       Detected in single   Detected in some   Detected in many   Detected in all   Total
Tissue enriched   586                  1363               739                157               2845
Group enriched    0                    922                643                72                1637
Tissue enhanced   149                  1101               3371               1966              6587
Total             735                  3386               4753               2195              11069
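
The counts in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below uses the cell values as read from the table (four distribution counts per specificity row) and verifies that they add up to the reported row totals, column totals and the overall total of 11069. The variable names are illustrative only.

```python
# Cell values per specificity category, in column order:
# detected in single, some, many, all.
table1 = {
    "Tissue enriched": [586, 1363, 739, 157],
    "Group enriched": [0, 922, 643, 72],
    "Tissue enhanced": [149, 1101, 3371, 1966],
}

# Totals as reported in the table.
reported_row_totals = {
    "Tissue enriched": 2845,
    "Group enriched": 1637,
    "Tissue enhanced": 6587,
}
reported_column_totals = [735, 3386, 4753, 2195]
reported_grand_total = 11069

# Recompute the totals from the individual cells.
row_totals = {name: sum(counts) for name, counts in table1.items()}
column_totals = [sum(column) for column in zip(*table1.values())]
grand_total = sum(row_totals.values())

table_is_consistent = (
    row_totals == reported_row_totals
    and column_totals == reported_column_totals
    and grand_total == reported_grand_total
)

if __name__ == "__main__":
    print("Table 1 is internally consistent:", table_is_consistent)
```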

The number of tissue elevated genes varies widely between the analyzed tissue types (see Table 2 below). Testis shows the largest number of tissue enriched genes (n=950), followed by brain (n=488) and liver (n=242). The large number of enriched genes in testis is considered to be due to the highly specialized processes occurring during spermatogenesis. It is also likely that many of these genes have a shared expression with oocytes in the female ovaries. However, oocytes are difficult to analyze because of the complex kinetics of female germ cell development, including the first rounds of meiosis, which in females occur at the embryonic stage. Some tissues have similar functions and tissue morphology and, as expected, the tissue elevated genes in these tissues are predominantly group enriched genes, as exemplified by lymphoid tissues and the gastrointestinal tract.

In addition to previously known proteins, the analysis also identified a large number of genes with tissue elevated expression patterns that were previously poorly characterized and with no or only scarce evidence of existence at protein level. The combined RNA and antibody-based profiling can thus be used to confirm physiological functions of such protein coding genes lacking previous annotation. These proteins are interesting starting points for further in-depth studies to gain a better understanding of the molecular mechanisms of the various cellular phenotypes that define the function of each respective tissue and organ.


Table 2. Tissue elevated genes.

Tissue  Tissue enriched  Group enriched  Tissue enhanced  Total elevated
Brain 488 496 1603 2587
Retina 87 79 144 310
Pituitary gland 26 111 216 353
Thyroid gland 13 32 154 199
Parathyroid gland 22 34 168 224
Adrenal gland 9 84 135 228
Lung 13 61 165 239
Salivary gland 42 78 199 319
Esophagus 5 57 249 311
Tongue 11 165 199 375
Stomach 17 52 90 159
Intestine 122 200 442 764
Liver 242 177 517 936
Gallbladder 3 41 117 161
Pancreas 64 93 265 422
Kidney 53 131 229 413
Urinary bladder 1 18 80 99
Testis 950 399 925 2274
Epididymis 77 104 231 412
Prostate 11 25 84 120
Seminal vesicle 2 36 142 180
Ductus deferens 0 34 81 115
Breast 19 51 117 187
Vagina 3 17 71 91
Cervix, uterine 0 27 104 131
Endometrium 2 14 69 85
Fallopian tube 6 75 231 312
Ovary 2 26 145 173
Placenta 91 99 304 494
Heart muscle 29 133 225 387
Skeletal muscle 111 202 594 907
Smooth muscle 0 14 98 112
Adipose tissue 2 50 160 212
Skin 113 125 309 547
Bone marrow 29 135 370 534
Lymphoid tissue 123 333 963 1419
Blood 57 397 944 1398
Total 2845 1637 6587 11069
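
A short sketch can confirm the ranking quoted in the text and shows how the columns of Table 2 relate to their totals. It uses the tissue enriched and group enriched counts from the table. The observation that the group enriched column adds up to more than 1637 is interpreted here as a gene being counted once for every tissue in its group; that reading is an inference from the definition of group enriched, not a statement from the source.

```python
# (tissue, tissue enriched, group enriched) as listed in Table 2.
rows = [
    ("Brain", 488, 496), ("Retina", 87, 79), ("Pituitary gland", 26, 111),
    ("Thyroid gland", 13, 32), ("Parathyroid gland", 22, 34),
    ("Adrenal gland", 9, 84), ("Lung", 13, 61), ("Salivary gland", 42, 78),
    ("Esophagus", 5, 57), ("Tongue", 11, 165), ("Stomach", 17, 52),
    ("Intestine", 122, 200), ("Liver", 242, 177), ("Gallbladder", 3, 41),
    ("Pancreas", 64, 93), ("Kidney", 53, 131), ("Urinary bladder", 1, 18),
    ("Testis", 950, 399), ("Epididymis", 77, 104), ("Prostate", 11, 25),
    ("Seminal vesicle", 2, 36), ("Ductus deferens", 0, 34),
    ("Breast", 19, 51), ("Vagina", 3, 17), ("Cervix, uterine", 0, 27),
    ("Endometrium", 2, 14), ("Fallopian tube", 6, 75), ("Ovary", 2, 26),
    ("Placenta", 91, 99), ("Heart muscle", 29, 133),
    ("Skeletal muscle", 111, 202), ("Smooth muscle", 0, 14),
    ("Adipose tissue", 2, 50), ("Skin", 113, 125), ("Bone marrow", 29, 135),
    ("Lymphoid tissue", 123, 333), ("Blood", 57, 397),
]

# A tissue enriched gene belongs to exactly one tissue, so this column
# should add up to the reported total of 2845.
enriched_sum = sum(enriched for _, enriched, _ in rows)

# The group enriched column counts a gene in each tissue of its group,
# so its sum is expected to exceed the 1637 distinct genes.
group_sum = sum(group for _, _, group in rows)

# Tissues ranked by number of tissue enriched genes.
top_three = [
    name for name, _, _ in sorted(rows, key=lambda r: r[1], reverse=True)[:3]
]

if __name__ == "__main__":
    print("Tissue enriched sum:", enriched_sum)
    print("Group enriched column sum:", group_sum)
    print("Top three tissues:", top_three)
```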


Tissue enriched genes

The comprehensive analysis presented here has identified 2845 human genes that display a tissue enriched expression pattern across the human body. A functional analysis revealed that the overall function of a tissue was highly associated with the function of the proteins encoded by the genes enriched in that tissue. Antibody-based protein profiling using immunohistochemistry allows for localization of the proteins corresponding to the different tissue enriched genes, and provides a precise map of protein expression in the various compartments and cell types that constitute different tissues and organs. Examples of tissue type specific proteins with a direct link to tissue function are presented below.

Brain

  • GFAP (Glial fibrillary acidic protein) - astrocyte intermediate filament protein
  • MBP (Myelin basic protein) - major constituent of the myelin sheath


GFAP - cerebral cortex

MBP - hippocampus

Retina

  • RHO (Rhodopsin) - involved in phototransduction in rod photoreceptors
  • ARR3 (Arrestin 3) - involved in phototransduction in cone photoreceptors


RHO - retina

ARR3 - retina

Pituitary gland

  • FSHB (Follicle stimulating hormone beta subunit) - hormone inducing egg and sperm production
  • TSHB (Thyroid stimulating hormone beta) - hormone regulating thyroid gland function


FSHB - pituitary gland

TSHB - pituitary gland

Kidney

  • SLC22A13 (Solute carrier family 22 member 13) - membrane bound organic anion transporter
  • NPHS2 (Podocin) - involved in the regulation of glomerular permeability


SLC22A13 - kidney

NPHS2 - kidney

Liver

  • ALB (Albumin) - plasma protein
  • CYP2A13 (Cytochrome P450 family 2 subfamily A member 13) - involved in drug metabolism and cholesterol and steroid synthesis


ALB - liver

CYP2A13 - liver

Pancreas

  • AMY2A (Amylase, alpha 2A) - an enzyme that digests carbohydrates, secreted by exocrine cells
  • INS (Insulin) - involved in lowering of blood glucose, secreted by endocrine cells


AMY2A - pancreas

INS - pancreas

Male tissues

  • DMRT1 (Doublesex- and mab-3-related transcription factor 1) - testis enriched, involved in meiosis
  • PRM2 (Protamine 2) - important for spermatogenesis


DMRT1 - testis

PRM2 - testis

Female tissues

  • CSH1 (Chorionic somatomammotropin hormone 1) - hormone important for growth control during pregnancy
  • OVGP1 (Oviductal glycoprotein 1) - mucus protein important in mucociliary transport of the fertilized ovum


CSH1 - placenta

OVGP1 - fallopian tube

Skin

  • KRT1 (Keratin 1) - involved in squamous differentiation and skin barrier function
  • KRT27 (Keratin 27) - plays a role in hair formation


KRT1 - skin

KRT27 - hair


Group enriched genes

The 1637 genes identified with a group enriched expression pattern have shared expression in a limited number of tissues. The corresponding proteins may be involved in traits that are shared between cell types located in different tissues and organs, such as proteins expressed in immune cells (various lymphoid tissues), proteins involved in squamous differentiation (esophagus and skin), glandular cell function in the gastrointestinal tract (duodenum, small intestine and colon) or cilia movement (testis and fallopian tube). The schematic network plot below shows how group enriched genes are distributed between different tissues.

Figure 3. An interactive network plot of the tissue enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of tissue enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 3 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.


Immune cells can be found in both lymphoid organs and organs infiltrated by immune cells, such as the intestine. Consequently, genes important for immune cell function are often enriched in both lymphoid tissues and the intestine. One such gene is MS4A1, which encodes CD20, an activated-glycosylated phosphoprotein expressed on the surface of B-cells beginning at the pro-B phase with progressively increasing concentrations until maturity.


MS4A1 - lymph node

MS4A1 - appendix

MS4A1 - small intestine

Squamous epithelium is found in many parts of the body as dry skin or wet mucosa, acting as a robust barrier against various chemical and mechanical stresses. DSC3 (Desmocollin 3), which codes for a protein important in cell-cell junctions and cellular adhesion, is group enriched in epithelial tissues such as esophagus and skin, as exemplified below.


DSC3 - esophagus

DSC3 - skin

Mucus has several functions in the body related to transportation and barrier functions. The function of the mucus in the salivary gland is related to food and pathogens, while the mucus in the cervix is involved in, for example, transportation and blockage of sperm during sexual reproduction. MUC16 is a mucus component and is group enriched in both the mucus-producing salivary gland and cervix.


MUC16 - salivary gland

MUC16 - cervix

The fallopian tube shares many elevated genes with testis. The common denominator is the use of cilia, or the structurally similar flagellum, for essential organ functions. DNAI2, a dynein protein, constitutes a motor protein component of the motile cilia of multiciliated cells as well as of the flagellum (tail) of the sperm. By pulling on the microtubule structure of the cilium or flagellum, the motor protein creates motion and, in the case of the sperm, sperm motility. In the IHC images below, expression of DNAI2 can be seen in a subset of cilia in the fallopian tube (left and middle image), as well as in the flagellum of spermatids and the cytoplasm of differentiating spermatocytes (right image).


DNAI2 - fallopian tube

DNAI2 - fallopian tube ciliated cells

DNAI2 - testis


Tissue enhanced genes

The category “tissue enhanced genes” is defined as genes that do not fulfill the criteria of tissue enriched but show an at least four-fold higher mRNA level in a particular tissue compared to the average level in all other tissues.


Examples of protein expression

Below are examples of protein expression patterns of mainly known and well characterized elevated genes in the different parts of the body.

Brain


ELAVL3 - cerebral cortex

SLC17A7 - caudate

Endocrine tissues


TG - thyroid gland

TPO - thyroid gland


HSD3B2 - adrenal gland

PNMT - adrenal gland

Lung


SFTPA1 - lung

SFTPB - lung

Proximal digestive tract


STATH - salivary gland

KRT4 - esophagus

Gastrointestinal tract


PGA4 - stomach

DEFA5 - duodenum

FABP2 - duodenum

FABP6 - small intestine

KRT20 - colon

Pancreas


CPA2 - pancreas

GCG - pancreas

Liver & gallbladder


HP - liver

UGT2B4 - liver

CHST4 - gallbladder

Kidney & urinary bladder


SLC22A8 - kidney

UPK2 - urinary bladder

Male tissues


SEMG1 - seminal vesicle

KLK3 - prostate

Female tissues


SNTN - fallopian tube

MUM1L1 - ovary


PAEP - endometrium

GH2 - placenta

Muscle tissues


TNNI3 - heart muscle

TNNT2 - heart muscle

MYH7 - skeletal muscle

Adipose & soft tissue


FABP4 - adipose tissue (soft tissue)

PLIN1 - adipose tissue (breast)

Skin


CASP14 - skin

KRT82 - hair

Bone marrow


MPO - bone marrow

DEFA1 - bone marrow

ITGAM - bone marrow

Lymphoid tissue


CD8B - thymus

CD19 - appendix


CD72 - spleen

CD22 - lymph node


Table 3. Tissue specific scores and mRNA levels (measured as NX) are given for the above selected examples of tissue type elevated proteins.

Elevated tissue  Gene  Description  Tissue specific score  mRNA level (NX)
Adipose tissue FABP4 fatty acid binding protein 4 9* 492.7
Adipose tissue PLIN1 perilipin 1 12* 187.5
Adrenal gland HSD3B2 hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 2 5 223.2
Adrenal gland PNMT phenylethanolamine N-methyltransferase 5* 94.6
Bone marrow DEFA1 defensin alpha 1 7 825.5
Bone marrow ITGAM integrin subunit alpha M 0 93.7
Bone marrow MPO myeloperoxidase 15 315.2
Brain ELAVL3 ELAV like RNA binding protein 3 13 53.8
Brain GFAP glial fibrillary acidic protein 61 595.5
Brain MBP myelin basic protein 43 1297.7
Brain SLC17A7 solute carrier family 17 member 7 16 150.0
Mucus (Cervix) MUC16 mucin 16, cell surface associated 4* 19.3
Endometrium PAEP progestagen associated endometrial protein 9* 79.9
Esophagus KRT4 keratin 4 8* 637.8
Fallopian tube OVGP1 oviductal glycoprotein 1 7 240.9
Fallopian tube SNTN sentan, cilia apical structure protein 6* 73.9
Gallbladder CHST4 carbohydrate sulfotransferase 4 5 46.9
Heart muscle TNNI3 troponin I3, cardiac type 367 419.3
Heart muscle TNNT2 troponin T2, cardiac type 241 679.2
Duodenum (Intestine) DEFA5 defensin alpha 5 231 249.9
Duodenum (Intestine) FABP2 fatty acid binding protein 2 47 50.0
Small intestine (Intestine) FABP6 fatty acid binding protein 6 68 299.2
Colon (Intestine) KRT20 keratin 20 6 75.5
Kidney NPHS2 NPHS2, podocin 143 66.5
Kidney SLC22A13 solute carrier family 22 member 13 7 16.4
Kidney SLC22A8 solute carrier family 22 member 8 341* 127.3
Liver ALB albumin 81 3075.8
Liver CYP2A13 cytochrome P450 family 2 subfamily A member 13 20 30.7
Liver HP haptoglobin 37 1156.9
Liver UGT2B4 UDP glucuronosyltransferase family 2 member B4 105 175.1
Lung SFTPA1 surfactant protein A1 66 411.8
Lung SFTPB surfactant protein B 17 757.8
Appendix (Lymphoid tissue) CD19 CD19 molecule 6* 37.0
Lymph node (Lymphoid tissue) CD22 CD22 molecule 0 52.6
Spleen (Lymphoid tissue) CD72 CD72 molecule 13* 59.7
Thymus (Lymphoid tissue) CD8B CD8b molecule 5* 61.5
Tonsil (Lymphoid tissue) MS4A1 membrane spanning 4-domains A1 6* 134.1
Ovary MUM1L1 MUM1 like 1 0 79.4
Pancreas AMY2A amylase, alpha 2A (pancreatic) 31 1849.4
Pancreas CPA2 carboxypeptidase A2 529 1357.9
Pancreas GCG glucagon 16 202.8
Pancreas INS insulin 193 514.0
Pituitary gland FSHB follicle stimulating hormone beta subunit 90 70.7
Pituitary gland TSHB thyroid stimulating hormone beta 77 192.4
Placenta CSH1 chorionic somatomammotropin hormone 1 29 450.4
Placenta GH2 growth hormone 2 309* 138.4
Prostate KLK3 kallikrein related peptidase 3 115 447.4
Retina ARR3 arrestin 3 11 62.9
Retina RHO rhodopsin 151 415.0
Salivary gland STATH statherin 162 5940.6
Seminal vesicle SEMG1 semenogelin I 26* 1931.9
Skeletal muscle MYH7 myosin heavy chain 7 6* 596.4
Skin CASP14 caspase 14 6 109.4
Squamous epithelium (Skin) DSC3 desmocollin 3 5* 99.1
Skin KRT1 keratin 1 5 606.8
Hair (Skin) KRT27 keratin 27 8 37.4
Hair (Skin) KRT82 keratin 82 9 14.9
Stomach PGA4 pepsinogen 4, group I (pepsinogen A) 120 924.0
Testis DMRT1 doublesex and mab-3 related transcription factor 1 81 44.1
Cilium/Flagellum (Testis) DNAI2 dynein axonemal intermediate chain 2 5* 47.2
Testis PRM2 protamine 2 99 699.5
Thyroid gland TG thyroglobulin 247 994.3
Thyroid gland TPO thyroid peroxidase 58 411.3
Urinary bladder UPK2 uroplakin 2 6* 52.4
* group enriched score for tissue types with similar function and morphology

