GENOMICS
Genomics is a branch of molecular biology that studies
the genomes of living organisms.
WHAT DOES GENOMICS DEAL WITH?
In particular, it deals with the structure,
content, function and evolution of genomes.
It is a science that relies on bioinformatics
to process and visualize the enormous
quantity of data it produces.
GENOMICS
Genomics took shape in the 1980s, when the first initiatives
to sequence whole genomes were launched.
Its origin can arguably be traced back to the first complete
genome sequence, published in 1977: it was the genome of a
virus, the phage ΦX174.
The first genome sequence of a free-living organism was
completed in 1995: it belonged to a bacterium,
Haemophilus influenzae, with a genome of considerable size
(1.8 million base pairs).
Since then the number of "completed" genomes has grown exponentially
(as of January 2008, more than 700: 50 Archaea, 575 other prokaryotes, 77 eukaryotes).
The first plant whose genome sequence was completely determined
was Arabidopsis thaliana.
Goals of genomics
• Genetic and physical maps of the DNA of living organisms,
obtained through its complete sequencing.
• The DNA sequence is then annotated: all genes and other
significant portions of the sequence are identified and flagged,
together with all the information known about those
genes.
• Entry of this information into dedicated databases,
accessible over the Internet (free of charge).
• Comparative genomics, which compares the genomes of
different organisms in terms of their organization and
sequence.
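As a rough illustration of the annotation step listed above, the sketch below scans a DNA string for open reading frames (ORFs), one very simplified way of flagging candidate genes. It is not the method used by any real annotation pipeline: it only looks at the three forward reading frames, assumes the standard ATG start codon and the three standard stop codons, and the function name `find_orfs`, the `min_codons` parameter and the example sequence are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start is the index of the ATG, end is exclusive and includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical toy sequence, used only to exercise the function.
dna = "CCATGAAATTTTAGGG"
orfs = find_orfs(dna)
for start, end, frame in orfs:
    print(f"frame {frame}: {start}-{end} {dna[start:end]}")
```

Real annotation also has to handle the reverse strand, introns in eukaryotes, and non-coding features, none of which this sketch attempts.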
DNA SEQUENCING
Part of a radioactively labelled sequencing gel
Chain-termination methods
The classical chain-termination or Sanger method requires a single-stranded DNA template, a DNA primer, a DNA
polymerase, radioactively or fluorescently labeled nucleotides, and modified nucleotides that terminate DNA strand
elongation. The DNA sample is divided into four separate sequencing reactions, containing the four standard
deoxynucleotides (dATP, dGTP, dCTP and dTTP) and the DNA polymerase. To each reaction is added only one of
the four dideoxynucleotides (ddATP, ddGTP, ddCTP, or ddTTP). These dideoxynucleotides are the chain-terminating
nucleotides, lacking a 3'-OH group required for the formation of a phosphodiester bond between two nucleotides
during DNA strand elongation. Incorporation of a dideoxynucleotide into the nascent (elongating) DNA strand
therefore terminates DNA strand extension, resulting in DNA fragments of varying length. The
dideoxynucleotides are added at lower concentration than the standard deoxynucleotides to allow strand elongation
sufficient for sequence analysis.
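The logic of the four reactions can be sketched in a few lines of Python. This is an idealized model, not a simulation of the chemistry: it assumes that each reaction ends up containing one fragment for every position where its dideoxynucleotide could be incorporated, it ignores the primer length, and the function names and the example template are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template):
    """Strand built by the polymerase: reverse complement of the template, 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def chain_termination_fragments(template):
    """For each ddNTP reaction, list the lengths of the terminated fragments."""
    new_strand = synthesized_strand(template)
    reactions = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        # Incorporating the dideoxy form of this base stops elongation here,
        # so the reaction for this base contains a fragment of this length.
        reactions[base].append(length)
    return reactions


def read_gel(reactions):
    """Read the sequence by ordering all fragments from shortest to longest."""
    base_by_length = {
        length: base
        for base, lengths in reactions.items()
        for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))


template = "TACGGATC"  # hypothetical template, written 5'->3'
reactions = chain_termination_fragments(template)
sequence = read_gel(reactions)
print(reactions)
print(sequence)
```

Reading the fragments in order of increasing length, and noting which of the four reactions each one came from, recovers the newly synthesized strand, which is what reading a sequencing gel lane by lane amounts to.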
Capillary electrophoresis
Example output
Current methods can directly sequence only relatively short (300-1000 nucleotides long) DNA fragments in
a single reaction [2]. The main obstacle to sequencing DNA fragments above this size limit is insufficient
power of separation for resolving large DNA fragments that differ in length by only one nucleotide.
Shotgun strategy
High-throughput sequencing
The high demand for low-cost sequencing has given rise to a number of high-throughput
sequencing technologies.[15][16] These efforts have been funded by
public and private institutions, and have also been researched and commercialized
privately by biotechnology companies. High-throughput sequencing technologies are
intended to lower the cost of sequencing DNA libraries beyond what is possible
with the current dye-terminator method based on DNA separation by capillary
electrophoresis. Many of the new high-throughput technologies
parallelize the sequencing process, producing thousands or millions of sequences
at once.
Genomics has more recently been joined by new
branches of biology that share its approach
to research:
• Transcriptomics studies gene expression at the level of the messenger RNAs of a
whole organism or of a particular organ, tissue or cell, at a particular point
in the organism's development or under particular environmental conditions,
mainly using microarrays
• Proteomics (2D electrophoresis, etc.)
• Metabolomics (gas chromatography, etc.)
• Metagenomics
MICROARRAY
A DNA microarray (also commonly known as gene or genome chip, DNA chip, or gene
array) is a collection of microscopic DNA spots, commonly representing single genes,
arrayed on a solid surface by covalent attachment to a chemical matrix. DNA arrays are
different from other types of microarray only in that they either measure DNA or use DNA as
part of their detection system. Qualitative or quantitative measurements with DNA microarrays
utilize the selective nature of DNA-DNA or DNA-RNA hybridization under high-stringency
conditions and fluorophore-based detection. DNA arrays are commonly used for expression
profiling, i.e., monitoring expression levels of thousands of genes simultaneously, or for
comparative genomic hybridization.
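A minimal sketch of how expression profiling data are often summarized is shown below. It assumes a two-channel experiment in which each spot yields one fluorescence intensity for the sample and one for a reference, and it reports the log2 ratio per gene. The gene names, the intensities, the two-fold threshold and the function names are all hypothetical, and real analyses add background correction and normalization that are omitted here.

```python
import math


def log2_ratios(sample, reference):
    """Log2 of sample/reference intensity for every gene on the array."""
    return {gene: math.log2(sample[gene] / reference[gene]) for gene in sample}


def classify(ratios, threshold=1.0):
    """Label genes as up, down or unchanged (threshold 1.0 means two-fold)."""
    calls = {}
    for gene, ratio in ratios.items():
        if ratio >= threshold:
            calls[gene] = "up"
        elif ratio <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical spot intensities, invented for illustration only.
sample = {"geneA": 1600.0, "geneB": 250.0, "geneC": 800.0}
reference = {"geneA": 400.0, "geneB": 1000.0, "geneC": 800.0}

ratios = log2_ratios(sample, reference)
calls = classify(ratios)
print(ratios)
print(calls)
```

Working on a log scale makes a four-fold increase and a four-fold decrease symmetric (+2 and -2), which is why ratios are usually reported this way.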
http://en.wikipedia.org/wiki/Image:Microarray_printing.ogg
Microarray
Public databases of microarray data

Database                         Microarray experiment sets   Sample profiles   As of date
Gene Expression Omnibus - NCBI   5366                         134669            April 1, 2007
Stanford Microarray database     12742                        ?                 April 1, 2007
UPenn RAD database               ~100                         ~2500             Sept. 1, 2007
UNC Microarray database          ~31                          2093              April 1, 2007
MUSC database                    ~45                          555               April 1, 2007
ArrayExpress at EBI              1643                         136               April 1, 2007
caArray at NCI                   41                           1741              November 15, 2006
UPSC-BASE                        ~100                         ?                 November 15
Online microarray data-analysis programs and tools
Several Open Directory Project categories list online microarray data analysis programs and tools:
•Bioinformatics : Online Services : Gene Expression and Regulation at the Open Directory Project
•Gene Expression : Databases at the Open Directory Project
•Gene Expression : Software at the Open Directory Project
•Data Mining : Tool Vendors at the Open Directory Project
•Bioconductor: open source and open development software project for the analysis and comprehension of genomic data
•Genevestigator : Web-based database and analysis tool to study gene expression across large sets of tissues, developmental
stages, drugs, stimuli, and genetic modifications.
•GeneCAT (Gene Co-expression Analysis Toolbox): Web-based database of gene expression data and expression analysis tools
for Arabidopsis thaliana and barley.
Phylogenetic profiles
The rationale behind phylogenetic profiles is that genes that
are involved in a given biological process tend to be either all
present or all absent, depending on whether that process is
active in the different organisms that are considered.
Therefore, genes that are functionally associated will tend
to have very similar phylogenetic profiles.
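The idea can be sketched as follows: a profile is a vector of presence (1) and absence (0) of a gene across a fixed list of organisms, and two profiles are compared by counting the positions where they differ. The organisms, gene names and gene contents below are hypothetical, and Hamming distance is only one of several possible similarity measures.

```python
def phylogenetic_profile(gene, genomes, organisms):
    """Presence (1) or absence (0) of a gene in each organism, in a fixed order."""
    return tuple(1 if gene in genomes[organism] else 0 for organism in organisms)


def hamming_distance(profile_a, profile_b):
    """Number of organisms in which the two genes differ in presence."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Hypothetical gene content of four organisms, invented for illustration.
genomes = {
    "org1": {"geneA", "geneB", "geneC"},
    "org2": {"geneC"},
    "org3": {"geneA", "geneB"},
    "org4": {"geneA", "geneB", "geneC"},
}
organisms = sorted(genomes)

profiles = {
    gene: phylogenetic_profile(gene, genomes, organisms)
    for gene in ("geneA", "geneB", "geneC")
}
print(profiles)
print(hamming_distance(profiles["geneA"], profiles["geneB"]))
print(hamming_distance(profiles["geneA"], profiles["geneC"]))
```

In this toy data set geneA and geneB are always present or absent together, so their profiles are identical, which under the rationale above would suggest a functional association worth testing.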