Homeobox

A homeobox is a, around 180 s long, found within s that are involved in the regulation of patterns of anatomical development in s, , s, and numerous single cell eukaryotes. These genes encode homeodomain products that are s sharing a characteristic  structure that binds.

is a term coined by to describe the outright replacement of a discrete body part with another body part, e.g. —replacement of the antenna on the head of a fruit fly with legs. The "homeo-" prefix in the words "homeobox" and "homeodomain" stems from this, which is frequently observed when these genes are mutated in.

Discovery
Homeoboxes were discovered independently in 1983 by, , and working in the lab of  at the , ; and by  and Amy Weiner, who were then working with  at  in. The existence of homeobox genes were first discovered in , where mutations in homeobox genes caused the radical alterations known as " transformations". One of the most famous such mutation is , in which legs grow from the head of a fly instead of the expected antennae.

Homeodomain proteins
A homeobox is about 180 DNA s long and encodes a that binds. The following shows the homeodomain (~60 amino acid residue chain):

Helix 1         Helix 2         Helix 3/4 ______________   __________    _________________ RRRKRTAYTRYQLLELEKEFHFNRYLTRRRRIELAHSLNLTERHIKIWFQNRRMKWKKEN ....|....|....|....|....|....|....|....|....|....|....|....|        10        20        30        40        50        60

Structure
The characteristic homeodomain consists of a 60- long domain composed of three alpha helixes. Helix 2 and helix 3 form a so-called (HTH) structure, where the two  are connected by a short loop region. The two helices of the homeodomain are  and the longer  helix is roughly perpendicular to the axes established by the first two. It is this third helix that interacts directly with via a number of hydrogen bonds and hydrophobic interactions, as well as indirect interactions via water molecules, which occur between specific s and the exposed s within the  of the DNA.

Homeodomain proteins are found in s. Through the HTH motif, they share limited sequence similarity and structural similarity to prokaryotic transcription factors, such as  proteins that alter the expression of genes in s. The HTH motif shows some sequence similarity but a similar structure in a wide range of DNA-binding proteins (e.g.,  and s, homeodomain proteins, etc.). One of the principal differences between HTH motifs in these different proteins arises from the stereo-chemical requirement for in the turn which is needed to avoid  interference of the beta-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, whereas for many of the homeotic and other DNA-binding proteins the requirement is relaxed.

Sequence specificity
Homeodomains can bind both specifically and nonspecifically to with the C-terminal recognition helix aligning in the DNA's major groove and the unstructured peptide "tail" at the N-terminus aligning in the minor groove. The recognition helix and the inter-helix loops are rich in and  residues, which form s to the DNA backbone;   residues in the center of the recognition helix aid in stabilizing the helix packing. Homeodomain proteins show a preference for the DNA sequence 5'-TAAT-3'; sequence-independent binding occurs with significantly lower affinity.

Biological function
Through the DNA-recognition properties of the homeodomain, homeoproteins are believed to regulate the expression of targeted genes and direct the formation of many body structures during early embryonic development. Many homeodomain proteins induce by initiating the cascades of coregulated genes required to produce individual  and s. Other proteins in the family, such as  are involved in maintaining. Homeobox genes are critical in the establishment of body axes during.

Homeoprotein transcription factors typically switch on cascades of other genes. The homeodomain binds DNA in a sequence-specific manner. However, the specificity of a single homeodomain protein is usually not enough to recognize only its desired target genes. Most of the time, homeodomain proteins act in the of their target genes as complexes with other transcription factors. Such complexes have a much higher target specificity than a single homeodomain protein. Homeodomains are encoded both by genes of the and by other genes throughout the genome.

The homeobox domain was first identified in a number of Drosophila and segmentation proteins, but is now known to be well-conserved in many other animals, including s.

Specific members of the Hox family have been implicated in vascular remodeling,, and disease by orchestrating changes in matrix degradation, integrins, and components of the ECM. HoxA5 is implicated in atherosclerosis. HoxD3 and HoxB3 are proinvasive, angiogenic genes that upregulate b3 and a5 integrins and Efna1 in ECs, respectively. HoxA3 induces cell (EC) migration by upregulating MMP14 and uPAR. Conversely, HoxD10 and HoxA5 have the opposite effect of suppressing EC migration and angiogenesis, and stabilizing adherens junctions by upregulating TIMP1/downregulating uPAR and MMP14, and by upregulating Tsp2/downregulating VEGFR2, Efna1, Hif1alpha and COX-2, respectively. HoxA5 also upregulates the tumor suppressor p53 and Akt1 by downregulation of PTEN. Suppression of HoxA5 has been shown to attenuate growth. HoxA5 has far-reaching effects on gene expression, causing ~300 genes to become upregulated upon its induction in breast cancer cell lines. HoxA5 protein transduction domain overexpression prevents inflammation shown by inhibition of TNFalpha-inducible monocyte binding to HUVECs.

Plant homeobox genes
As in animals, the plant homeobox genes code for the typical 60 amino acid long DNA-binding homeodomain or in case of the TALE (three amino acid loop extension) homeobox genes for an "atypical" homeodomain consisting of 63 amino acids. According to their conserved intron–exon structure and to unique codomain architectures they have been grouped into 14 distinct classes: HD-ZIP I to IV, BEL, KNOX, PLINC, WOX, PHD, DDT, NDX, LD, SAWADEE and PINTOX. Conservation of codomains suggests a common eukaryotic ancestry for TALE and non-TALE homeodomain proteins.

POU genes
Proteins containing a POU region consist of a homeodomain and a separate, POU domain that contains two  motifs and also binds DNA. The two domains are linked by a flexible loop that is long enough to stretch around the DNA helix, allowing the two domains to bind on opposite sides of the target DNA, collectively covering an eight-base segment with 5'-ATGCAAAT-3'. The individual domains of POU proteins bind DNA only weakly, but have strong sequence-specific affinity when linked. The POU domain itself has significant structural similarity with repressors expressed in s, particularly.

In humans, the POU genes are ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; and.

Hox genes
Hox genes are a subset of homeobox genes. They are essential n genes as they determine the identity of embryonic regions along the anterio-posterior axis. The first vertebrate Hox gene was isolated in  by Eddy De Robertis and colleagues in 1984, marking the beginning of the young science of ("evo-devo"). Mutations in these s cause displacement of organs.

In vertebrates, the four clusters are partially redundant in function, but have also acquired several derived functions. In particular, HoxA and HoxD specify segment identity along the axis.

The main interest in this set of genes stems from their unique behaviour. They are typically found in an organized cluster. The linear order of the genes within a cluster is directly correlated to the order of the regions they affect as well as the timing in which they are affected. This phenomenon is called colinearity. Due to this linear relationship, changes in the gene cluster due to mutations generally result in similar changes in the affected regions.

For example, when one gene is lost the segment develops into a more anterior one, while a mutation that leads to a gain of function causes a segment to develop into a more posterior one. This is called. Famous examples are and  in Drosophila, which can cause the development of legs instead of antennae and the development of a duplicated thorax, respectively.

Molecular evidence shows that some limited number of Hox genes have existed in the since before the earliest true, making these genes pre-.

Human homeobox genes
The Hox genes in humans are organized in four chromosomal clusters:

genes are analogously found in four areas. They include, , ; , ; and. Other genes considered Hox-like include, ; , ; , ; and. The NK-like (NKL) genes, some of which are considered "MetaHox", are grouped with Hox-like genes into a large ANTP-like group.

Humans have a :, , , , , and. Dlx genes are involved in the development of the nervous system and of limbs. They are considered a subset of the NK-like genes.

Human TALE (Three Amino acid Loop Extension) homeobox genes for an "atypical" homeodomain consist of 63 rather than 60 amino acids: ,, , , , ; , , ; ; , , , ; , ; , , ,.

In addition, humans have the following homeobox genes and proteins:
 * LIM-class:, ; , , , , , , , ; ,
 * POU-class:
 * CERS-class:, , , , ;
 * HNF-class: ;, ;
 * SINE-class:, , , , ,
 * CUT-class:, , ; , ; , ;
 * ZF-class:, ; , , ; , ; , , ; , ;
 * PRD-class: (CART1),, ; ; ; ; ; ; , , DUX ; ; , ; ; ; ; ; ; ; ; , , ; , , , , , , ; , ; , , ; ; , ; , ; , /; ; , ; ; ; ,
 * NKL-class:, ; , ; ; , ; , ; , ; ; ; , ; , ; ; ; , , ; , , ; , , ;
 * Nkx:, ; , ; , ; , , ; , , ; ; ; ;

Mutations
Mutations to homeobox genes can produce easily visible changes.

Two examples of homeobox mutations in the above-mentioned fruit fly are legs where the antennae should be, and a second pair of wings.

Duplication of homeobox genes can produce new body segments, and such duplications are likely to have been important in the of segmented animals. However, Hox genes typically determine the identity of body segments.

There is one family, the, in which both the  and mouthparts are remarkably leg-like in structure. This is not uncommon in arthropods as all arthropod appendages are homologous.

Regulation
Hox genes and their associated microRNAs are highly conserved developmental master regulators with tight tissue-specific, spatiotemporal control. These genes are known to be dysregulated in several cancers and are often controlled by DNA methylation. The regulation of Hox genes is highly complex and involves reciprocal interactions, mostly inhibitory. is known to use the and  Complexes to maintain the expression of Hox genes after the down-regulation of the pair-rule and gap genes that occurs during larval development. can silence the HOX genes by modulation of structure.

Evolution
It is accepted that the three major animal ANTP-class clusters, Hox, ParaHox, and NK (MetaHox), are the result of segmental duplications. A first duplication created MetaHox and ProtoHox, the latter of which later duplicated into Hox and ParaHox. The clusters themselves were created by tandem duplications of a single ANTP-class homeobox gene. Gene duplication followed by is responsible for the many homeobox genes found in eukaryotes.

The homeobox itself may have evolved from a non-DNA-binding transmembrane domain at the C-terminus of the MraY enzyme. This is based on metagenomic data acquired from the transitional archaeon, , that is regarded as the prokaryote closest to the ancestor of all eukaryotes.