...

Currently we enforce limits on the size of queries (based on input file size and the dataset being queried) to ensure that some queries don't clog up the server. We are working on hardware updates that will enable us to modify these restrictions.

Related pages:
NCBI BLAST limits.

If you need to process a large number of sequences, you can create your own local install of BLAST and use the TAIR BLAST datasets.

What is the most recent version of the reference genome assembly?

...

Converting genetic locations to sequence locations gives only an approximate correlation. This is because the conversion depends on both the genetic map used and the frequency of recombination, which varies within the genome. The most accurate estimate would be obtained by comparing the Lister and Dean RI map to the genome sequence (AGI map) for a global genome comparison value. You can also visually align genetic and sequence-based maps using the MapViewer tool, by aligning markers common to both maps.

How up to date is the Lister/Dean RI map data in the MapViewer?

...

You can now search for microarray experiments directly from the Microarray Elements Search. In addition, the Expression Viewer display now has direct links to the microarray experiment details. Click on the name of the hybridization in the Expression Viewer to display information about the experiment.

...

The raw data files for microarray experiments are large and have therefore been compressed. To uncompress the microarray data files (with the .gz suffix), do the following. On Mac/UNIX/Linux, type gzip -d /home/yourname/yourpath/filename.gz at the command line in a terminal window, for example: gzip -d /home/frank/franksfiles/ciw_2000.gz. On PCs, download the WinZip utility to decompress the files.
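As a platform-independent alternative to the commands above, the same decompression can be sketched with Python's standard gzip module. This is only an illustration: the function name gunzip and its keep_original parameter are made up for this example and are not part of any TAIR tool.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, keep_original=False):
    """Decompress a .gz file next to itself and return the output path.

    Hypothetical helper for illustration; it mimics `gzip -d`, which
    writes the uncompressed file and then removes the .gz file.
    """
    src = Path(src)
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src.name}")
    dest = src.with_suffix("")  # e.g. ciw_2000.gz -> ciw_2000
    # Stream the data so large microarray files are not loaded into memory.
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    if not keep_original:
        src.unlink()
    return dest
```

Calling gunzip("/home/frank/franksfiles/ciw_2000.gz") would write /home/frank/franksfiles/ciw_2000 and remove the compressed file, matching the gzip -d example.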

...

  • Use TAIR's BLAST to match the SNP sequence against the AGI whole genome or GenBank whole genome datasets.
  • Use the SeqViewer:
    • Paste the SNP/INDEL sequence into the input box.
    • Choose search by sequence.
    • Hits will be displayed as red lines on the chromosome bars.
    • Click on the chromosome bar to zoom in on the region.

...

Bulk

...

Downloading Data

Where did the bulk download tools go?

Bulk download functions have been integrated into the Gene Search and Protein Search. See the Bulk Downloading Gene Data help document for specifics.

Where can I get the most up to date TAIR data sets?

If you are a subscriber, the most up-to-date data is available through the website on locus and other detail pages, as well as through the bulk download tools. We also prepare quarterly data releases, which include up-to-date gene names, locus summaries, Gene Ontology annotations, Plant Ontology annotations, and publications. Subscribers can access the most recent releases here. Quarterly releases that are one year old are available here for non-subscribing institutions and users.

...

Tab-delimited files from the bulk tools search results or the Downloads site can easily be opened using a spreadsheet program such as Microsoft Excel or a text editor such as BBEdit.

  • Open the file in the application and follow the instructions for choosing column delimiters (tabs) and column format (use text as a default for all).
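The same import settings (tab as the column delimiter, every column treated as text) can also be applied in a script. The sketch below uses Python's standard csv module; the function name read_tab_delimited is hypothetical, and UTF-8 encoding is an assumption about the downloaded file rather than something stated by TAIR.

```python
import csv


def read_tab_delimited(path):
    """Return the rows of a tab-delimited file as lists of strings.

    Hypothetical helper for illustration. csv.reader never converts
    values, so every column stays text, which is the equivalent of
    choosing "text" as the column format in a spreadsheet program.
    """
    # Assumption: the file is UTF-8 encoded.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)
```

Keeping every column as text avoids the silent conversions a spreadsheet may apply, such as dropping leading zeros or reinterpreting identifiers as numbers or dates.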

How

...

Spreadsheet programs like Microsoft Excel allow you to save your file as tab delimited text. Excel spreadsheets (with the .xls extension) cannot be used to upload a list into the bulk search pages.

How can I obtain specific datasets such as all sequenced genes, or all markers in a defined region?

...

  • BLAST and PATMATCH datasets: AGI datasets and other datasets derived from the AGI sequences are updated after each major genome release. The current datasets reflect changes from the last genome release, Araport 11 (April 2016); the next release will be TAIR12.
  • Seed and DNA stocks from ABRC: variable.
  • Publications: downloaded weekly from PubMed.
  • Gene/Locus summary updates: weekly.
  • Gene Ontology Annotations: weekly updates from TAIR curation, monthly from the GO consortium.
  • Polymorphisms and Phenotypes: weekly updates from TAIR literature curation.

...

...

How do I submit data to TAIR?

See the Data Submission section for instructions on how to submit Marker/Polymorphism, Gene Family, Functional Genomics Gene Lists and other data to TAIR.

...

Before I name my gene, how can I find out if a gene name is in use?

Consult the Gene Symbol List for a list of gene names that have been reserved and are not available. You should also use the TAIR quicksearch to search for the name in the TAIR database or anywhere on the website (e.g. the list of Arabidopsis gene families), because the registry contains only a fraction of the names in use. UniProt is another source of curated gene names.

Related pages:
TAIR nomenclature guide

What is the difference between a hypothetical, unknown and putative protein?

Putative proteins are similar to a known gene. Unknown proteins are not similar to a known gene but do have EST or cDNA matches showing that they are expressed. Hypothetical proteins have no EST or cDNA matches and are not similar to a known gene, so there is no evidence that they are expressed genes.

Related pages:
TIGR naming conventions


FAQs for Developers

Frequently asked questions for software developers and casual programmers

What are the proper procedures for scripting or creating hyperlinks to TAIR database pages?

TAIR has many resources that users may access through automated retrieval programs instead of through a web browser. If you intend to use scripts to extract data from TAIR, please note the following:

  • Please contact us at curator@arabidopsis.org with your data requirements. We may be able to supply the data you need by running a custom script directly against our database, so that you do not have to submit requests through your own scripts.
  • Do not overload TAIR's systems by submitting requests through multiple threads or programs. Flooding the server with requests can lead to many problems including a denial of service to others trying to use the website.
  • Run retrieval scripts during off hours such as weekends or between 9PM and 5AM Eastern Time on weekdays.
  • Make no more than one request every 3 seconds.
  • TAIR will block, without warning, access from sites that overload the servers with requests.
  • TAIR features are under continuous development. URLs, query syntax and parameters may change without warning.
  • If you receive error messages or no results, please do not rerun your program, but contact us at curator@arabidopsis.org and we will assist you.
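The pacing rules above (a single sequential client, at most one request every 3 seconds, off-hours runs, no automatic reruns on error) could be sketched as follows. This is a minimal illustration and not an official TAIR client: Throttle, fetch_all and is_off_hours are hypothetical names, the fetch argument stands in for whatever function actually retrieves a page, and is_off_hours assumes the datetime passed in is already expressed in Eastern Time.

```python
import time
from datetime import datetime

MIN_INTERVAL_SECONDS = 3.0  # guideline: no more than one request every 3 seconds


def is_off_hours(eastern_now):
    """True on weekends, or between 9PM and 5AM on weekdays.

    Assumption: eastern_now is a datetime already in Eastern Time.
    """
    if eastern_now.weekday() >= 5:  # Saturday or Sunday
        return True
    return eastern_now.hour >= 21 or eastern_now.hour < 5


class Throttle:
    """Blocks so that consecutive calls to wait() are min_interval apart."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(urls, fetch, throttle=None):
    """Fetch URLs one at a time, in a single thread, pacing each request.

    Exceptions raised by fetch are not caught: the run stops instead of
    retrying, in line with the request not to rerun after errors.
    """
    throttle = throttle or Throttle()
    results = []
    for url in urls:
        throttle.wait()
        results.append(fetch(url))
    return results
```

A script might check is_off_hours before starting and then pass its own retrieval function to fetch_all. The clock and sleep parameters exist only so the pacing logic can be exercised without real delays.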

Related Pages:
Hyperlinking to TAIR