Project Overview

HAP takes as input a set of genotypes over a region, sampled from a population, and returns the haplotype phase of each individual's genotype. In our studies, HAP was very accurate when the sample contained at least a couple of dozen individuals. The public version of HAP currently works with unrelated individuals; an updated version that also accepts mother, father and child trios as input is planned.

In addition to phasing, HAP also partitions the region into blocks of correlated SNPs. The block partition is chosen so that it minimizes the number of tag SNPs.
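To make the notion of tag SNPs concrete, here is a minimal sketch under one common definition, which is an assumption and not taken from HAP itself: within a block, a set of tag SNPs is a subset of SNP positions whose alleles are enough to tell all distinct haplotypes in the block apart. The function name `minimal_tag_snps` and the 0/1 string encoding of haplotypes are hypothetical, and the brute-force search is for illustration only; it does not describe how HAP computes its partition.

```python
from itertools import combinations


def minimal_tag_snps(haplotypes):
    """Return a smallest list of SNP positions that distinguishes
    every distinct haplotype in a block.

    haplotypes: list of equal-length strings over {'0', '1'},
                one string per haplotype (hypothetical encoding).
    """
    distinct = sorted(set(haplotypes))
    n_sites = len(distinct[0])
    # Try subsets of increasing size; the first one that separates
    # all distinct haplotypes is a minimum-size tag set.
    for size in range(n_sites + 1):
        for sites in combinations(range(n_sites), size):
            signatures = {tuple(h[i] for i in sites) for h in distinct}
            if len(signatures) == len(distinct):
                return list(sites)
    return list(range(n_sites))


block = ["0000", "0011", "1100", "1111"]
print(minimal_tag_snps(block))
```

In this toy block, positions 0 and 1 always carry the same allele, as do positions 2 and 3, so one SNP from each pair is enough to identify the haplotype. The exhaustive search is exponential in the number of SNPs and is only practical for small blocks.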

For each block, HAP also reports the results of statistical tests of the correlation between haplotypes and a given phenotype.

HAP leverages a new insight into the underlying structure of haplotypes: SNPs are organized in highly correlated "blocks" (Daly et al., 2001; Patil et al., 2001).

HAP has been shown to have accuracy competitive with state-of-the-art software (such as PHASE and HAPLOTYPER). At the same time, HAP is extremely fast and can be used on very large data sets.

Genotypes, Haplotypes and SNPs - what are they?

Critical to the understanding of the genetic basis for complex diseases is the modeling of human variation. Most of this variation can be characterized by single nucleotide polymorphisms (SNPs), which are mutations at a single nucleotide position. So far, the Human Genome Project has provided the genome sequence of only a small set of individuals. Clearly, in order to fully understand the functions of the different parts of the genome, we need a better understanding of how genomes differ from one individual to another.

Each person's genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person's genotype specifies the pair of bases at each site, but does not specify which base occurs on which chromosome. The sequence of each chromosome separately is called a haplotype. The determination of the haplotypes within a population is essential for understanding genetic variation and the inheritance of complex diseases.
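The ambiguity described above can be illustrated with a short sketch. The encoding is an assumption made for this example only, not HAP's input format: each site of a genotype is written as '0' or '1' when both chromosomes carry that allele (homozygous) and as '2' when the two chromosomes differ (heterozygous). The hypothetical function `compatible_haplotype_pairs` lists every unordered pair of haplotypes that could have produced a given genotype.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with a genotype.

    genotype: string over {'0', '1', '2'} (hypothetical encoding):
              '0' or '1' = homozygous for that allele,
              '2'        = heterozygous (one chromosome has 0, the other 1).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == "2"]
    pairs = set()
    # Each heterozygous site can place its '0' on either chromosome.
    for assignment in product("01", repeat=len(het_sites)):
        first = list(genotype)
        second = list(genotype)
        for site, allele in zip(het_sites, assignment):
            first[site] = allele
            second[site] = "1" if allele == "0" else "0"
        # Sort so that (a, b) and (b, a) count as the same pair.
        pairs.add(tuple(sorted(("".join(first), "".join(second)))))
    return sorted(pairs)


for pair in compatible_haplotype_pairs("0212"):
    print(pair)
```

A genotype with k heterozygous sites (k at least 1) is consistent with 2^(k-1) distinct haplotype pairs, so the genotype alone quickly becomes ambiguous as k grows. Phasing is the task of choosing the correct pair among these candidates.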

Web Server