U-Probes : Thomas Lab Scripts

Universal Probes

Thomas Lab Script Library

Nsoop (October 2004)
Download File Description
nsoop_v2_scripts.tar All scripts used in the nsoop process. Includes the functions and modules required by nsoop, and the scripts used to validate the candidate probes that nsoop produces. Nsoop_v2 (version 2) has improved accumulation of scores.
nsoop_control.tar Examples of the small secondary files required by nsoop, including a tree file in 'newick' format.
nsoop_scripts.tar All scripts used in the nsoop process. Includes the functions and modules required by nsoop, and the scripts used to validate the candidate probes that nsoop produces.
Build 2 (October 2003)
Script Description Sample Input Sample Output
qsub_generate_probes.pl Run multi_soop.pl once for each input file in the current directory.
multi_soop.pl Generate probes from a multi-species alignment file (.maf) using scoring parameters from a configuration file. Sample input: chr25.maf, score.cfg. Sample output: chr25.xml.
create_megablast_files.sh Run create_megablast_file.pl once for each input file in the current directory.
create_megablast_file.pl From one XML probe file, prepare one or more FASTA-format files ready for megablast. Each FASTA file contains a maximum of N probes (default 3000). Sample input: chr25.xml. Sample output: chr25.xml.1.
qsub_megablast.pl Run mega_blast once for each input file in the current directory. Sample input: chr25.xml.1. Sample output: chr25.xml.1.mega_out.
qsub_megablast_validation.pl For each input file in the current directory, sort the file, then run validate_megablast_output.pl. Sample input: chr25.xml.1.mega_out. Sample output: chr25.xml.1.valid.
validate_megablast_output.pl Evaluate a file of sorted megablast output to characterize each probe as unique or non-unique.
qsub_separate_probes.pl Run separate_all_probes.pl once for each chromosome represented in XML input files.
separate_all_probes.pl For one chromosome, match validation results with full XML probe information to produce two output files, one limited to unique probes and the other to non-unique probes. Sample input: chr25.xml.1.valid. Sample output: chr25.probes.xml.valid, chr25.probes.xml.invalid.
load_probes.pl Insert probes from file into database. Sample input: chr25.probes.xml.valid, chr25.probes.xml.invalid.
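
The validation and separation steps listed above (characterize each probe as unique or non-unique from sorted megablast output, then split the probes into a valid and an invalid set) can be sketched roughly as follows. This is only an illustration in Python, not the lab's Perl code. It assumes tab-delimited hit lines whose first field is the probe name, and it assumes that a probe counts as unique when it has exactly one hit; the real criterion used by validate_megablast_output.pl is not stated here. The function names classify_hits and separate are hypothetical.

```python
from itertools import groupby
from typing import Dict, Iterable, List, Tuple


def classify_hits(sorted_hit_lines: Iterable[str]) -> Dict[str, str]:
    """Label each probe 'unique' or 'non-unique'.

    Assumption: lines are tab-delimited, already sorted by probe name,
    and the probe name is the first field.
    """
    rows = (
        line.rstrip("\n").split("\t")
        for line in sorted_hit_lines
        if line.strip()
    )
    labels: Dict[str, str] = {}
    for probe, hits in groupby(rows, key=lambda fields: fields[0]):
        hit_count = sum(1 for _ in hits)
        # Assumed rule: exactly one hit means the probe is unique.
        labels[probe] = "unique" if hit_count == 1 else "non-unique"
    return labels


def separate(
    probe_names: Iterable[str], labels: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Split probe names into (valid, invalid) lists.

    Probes without any label (no hit lines at all) are placed in the
    invalid list in this sketch; that choice is an assumption.
    """
    valid: List[str] = []
    invalid: List[str] = []
    for name in probe_names:
        if labels.get(name) == "unique":
            valid.append(name)
        else:
            invalid.append(name)
    return valid, invalid
```

Because the hit lines are sorted first, each probe's hits are adjacent and can be counted in a single pass without holding the whole file in memory, which is presumably why the pipeline sorts before validating.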
Build 1 (August 2003)
Script Description
separate_probes.pl This script takes an XML probe file and, based on the results from megablast, separates the probes into an XML.VALID or XML.INVALID file.
convert_axt.pl
sooper_xml_v2.pl This script takes the standard input formats, along with AXTE, and generates probes in XML format. Use of ungapped alignments is improved in v2 (version 2).
sooper_xml.pl This script takes the standard input formats, along with AXTE, and generates probes in XML format.
validate_megablast_output.pl This script takes the output from megablast and creates an output file that denotes whether each probe is unique or non-unique.
process_megablast_output.sh This script takes the mega_blast output, sorts it, and then validates the results (using validate_megablast_output.pl) to make sure that the probes are unique.
create_megablast_file.pl This script takes an XML probe file and creates a FASTA-formatted file that is ready for megablast. The script outputs several files with the same basename, each file containing N probes (default 3000).
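
The batching that create_megablast_file.pl is described as performing (several output files sharing one basename, each holding at most N probes) can be sketched as follows. This is a minimal Python illustration rather than the lab's Perl code. The (name, sequence) probe tuples and the function names chunk_probes, to_fasta and fasta_files are hypothetical, XML parsing is left out because the probe schema is not given here, and the numbered suffix simply follows the chr25.xml to chr25.xml.1 sample naming listed for Build 2.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical in-memory form of a probe: (name, sequence).
Probe = Tuple[str, str]


def chunk_probes(
    probes: Iterable[Probe], max_per_file: int = 3000
) -> Iterator[List[Probe]]:
    """Yield successive batches of at most max_per_file probes."""
    if max_per_file < 1:
        raise ValueError("max_per_file must be at least 1")
    batch: List[Probe] = []
    for probe in probes:
        batch.append(probe)
        if len(batch) == max_per_file:
            yield batch
            batch = []
    if batch:  # final, possibly shorter, batch
        yield batch


def to_fasta(batch: List[Probe]) -> str:
    """Render one batch as FASTA text: a '>' header line, then the sequence."""
    return "".join(f">{name}\n{seq}\n" for name, seq in batch)


def fasta_files(
    basename: str, probes: Iterable[Probe], max_per_file: int = 3000
) -> Dict[str, str]:
    """Map numbered file names (basename.1, basename.2, ...) to FASTA text."""
    return {
        f"{basename}.{index}": to_fasta(batch)
        for index, batch in enumerate(chunk_probes(probes, max_per_file), start=1)
    }
```

Returning the file contents as a dictionary instead of writing to disk keeps the sketch easy to test; a real script would write each entry to a file of that name.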
Miscellaneous Scripts
Script Description
add_results_to_hybridization.pl This script takes an HTML-formatted file containing probe results and adds the results to the hybridization file.
analyse_hybridization.pl This script takes a probe summary text file and can compute several types of calculations, such as melting temperature, number of mismatches, and longest identical string.
analyse_xml_probes.pl This script takes an XML hybridization text file and can compute several types of calculations, such as melting temperature, number of mismatches, and longest identical string.
blastz.sh This script goes through all Targets in a given directory and aligns the human and species sequences with blastz.
blastz_index.sh This script is used for the chicken generation. It goes into a target, reduces the blastz file to single coverage, and creates a blastz index file.
combine.pl This script takes an XML probe document and sorts the probes by start bp location.
combine_runs.sh This script combines the 3 probe generation runs into one XML file. This is done for both the valid and the invalid files.
count_probes.sh This script counts the number of probes throughout the probe generation process to make sure that no probe is dropped during processing.
create_consolidated_megablast_files.sh This script takes all *.sorted files and calls create_megablast_file.pl.
create_megablast_files_invalid_sorted.sh This script creates megablast-ready files for all *.invalid.sorted files.
create_megablast_files_valid_sorted.sh This script creates megablast-ready files for all *.valid.sorted files.
create_probe_summary.pl This script takes an HTML file and creates a probe_summary.txt file.
fix_blastz.sh Some blastz files had ^M (carriage return) characters that needed to be removed. This script removes those characters.
fix_reprocess.sh This script recreates the initial reprocess file for archive purposes.
fix_sorted_xml.sh This script fixes the missing XML header and footer in *.sorted files by adding both.
fix_xml_files.sh This script fixes the missing XML header and footer in *.xml files by adding both.
hybridize.pl This script performs a computational hybridization with a given XML file.
index_blastz.pl This script creates an index of the blastz file for quick searching. The index is printed to stdout.
lat.sh This script passes its arguments to the lat.jar program.
lat_all.sh This script traverses all TargetX directories and runs lat on all the single-coverage blastz files.
locate_bp.pl This script takes an XML file of probes and looks up the location of the probes in the human sequence.
name_probes.pl This script assigns a "temp" name to each probe in the XML file, based on the base name given on the command line.
probe_count.sh This script counts the number of probes in each mismatch group (0-4) in an XML file.
qsub_megablast_fix.pl
qsub_seperate_probes.pl
rank_probes.pl This script counts the number of probes for each species category.
reprocess.sh This script reads the reprocess files and generates new probes from the areas that generated non-unique probes.
reprocess_genome.sh This script is used for reprocessing: it takes all the *.mini files and generates probes from them.
scrubber.pl This program takes either a FASTA-formatted file or a PipMaker output file and replaces the repetitive sequence (lower case) with X's, so that Soop ignores the repetitive sequence when generating probes.
select_db_probes.pl This script takes a database-generated file (dbfile, in XML v1 format) and a probe file (XML v2 format) and creates an XML file that contains only the probes from the database. The XML file is in v2 format and contains all the information from the probe file.
select_probes.pl This script takes a species, target, and bp range and returns all the probes in that range. Currently everything is hard-coded.
soop.pl This is the original SOOP script to generate probes, prior to XML output.
soop_target.sh This script goes through the targets and generates probes from all the LAT output files (*.verbose).
soop_v2.pl This is the original SOOP script to generate probes, prior to XML output. Use of ungapped alignments is improved in v2 (version 2).
sort_invalid_probes.sh This script takes all the invalid probe files and combines them using combine.pl.
sort_valid_probes.sh This script takes all the valid probe files and combines them using combine.pl.
summarize_results.sh This script calculates the percent coverage of probes.
thomas_spliter.pl
truncate_fasta_header.pl This script takes a FASTA file, makes sure each header is unique, and removes all text after the first word of each header.
unique_fasta_header.pl This script takes a FASTA file and makes sure each header is unique. This is done by appending a number before the description. The sequence in the FASTA file is not changed.
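
The header renaming described for unique_fasta_header.pl (a number is attached before the description and the sequence is left alone) could look roughly like the Python sketch below. It is an illustration only: the function name unique_fasta_headers, the running counter, and the underscore separator are assumptions, since the exact numbering scheme of the Perl script is not given here.

```python
from typing import Iterable, List


def unique_fasta_headers(lines: Iterable[str]) -> List[str]:
    """Return FASTA lines with a running number attached to each header ID.

    Assumed scheme: '>id description' becomes '>id_N description', where N
    is the 1-based position of the record. Sequence lines pass through
    unchanged.
    """
    result: List[str] = []
    record_number = 0
    for line in lines:
        if line.startswith(">"):
            record_number += 1
            # Split the header into its first word and the remaining description.
            first_word, space, description = line[1:].rstrip("\n").partition(" ")
            result.append(f">{first_word}_{record_number}{space}{description}\n")
        else:
            result.append(line)
    return result
```

Numbering every record, rather than only the repeated ones, guarantees uniqueness in one pass without first collecting the set of existing headers.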
All Scripts
Script Description
add_results_to_hybridization.pl This script takes an HTML-formatted file containing probe results and adds the results to the hybridization file.
analyse_hybridization.pl This script takes a probe summary text file and can compute several types of calculations, such as melting temperature, number of mismatches, and longest identical string.
analyse_xml_probes.pl This script takes an XML hybridization text file and can compute several types of calculations, such as melting temperature, number of mismatches, and longest identical string.
blastz.sh This script goes through all Targets in a given directory and aligns the human and species sequences with blastz.
blastz_index.sh This script is used for the chicken generation. It goes into a target, reduces the blastz file to single coverage, and creates a blastz index file.
combine.pl This script takes an XML probe document and sorts the probes by start bp location.
combine_runs.sh This script combines the 3 probe generation runs into one XML file. This is done for both the valid and the invalid files.
convert_axt.pl
count_probes.sh This script counts the number of probes throughout the probe generation process to make sure that no probe is dropped during processing.
create_consolidated_megablast_files.sh This script takes all *.sorted files and calls create_megablast_file.pl.
create_megablast_file.pl From one XML probe file, prepare one or more FASTA-format files ready for megablast. Each FASTA file contains a maximum of N probes (default 3000).
create_megablast_files.sh Run create_megablast_file.pl once for each input file in the current directory.
create_megablast_files_invalid_sorted.sh This script creates megablast-ready files for all *.invalid.sorted files.
create_megablast_files_valid_sorted.sh This script creates megablast-ready files for all *.valid.sorted files.
create_probe_summary.pl This script takes an HTML file and creates a probe_summary.txt file.
fix_blastz.sh Some blastz files had ^M (carriage return) characters that needed to be removed. This script removes those characters.
fix_reprocess.sh This script recreates the initial reprocess file for archive purposes.
fix_sorted_xml.sh This script fixes the missing XML header and footer in *.sorted files by adding both.
fix_xml_files.sh This script fixes the missing XML header and footer in *.xml files by adding both.
hybridize.pl This script performs a computational hybridization with a given XML file.
index_blastz.pl This script creates an index of the blastz file for quick searching. The index is printed to stdout.
lat.sh This script passes its arguments to the lat.jar program.
lat_all.sh This script traverses all TargetX directories and runs lat on all the single-coverage blastz files.
load_probes.pl Insert probes from file into database.
locate_bp.pl This script takes an XML file of probes and looks up the location of the probes in the human sequence.
multi_soop.pl Generate probes from a multi-species alignment file (.maf) using scoring parameters from a configuration file.
name_probes.pl This script assigns a "temp" name to each probe in the XML file, based on the base name given on the command line.
probe_count.sh This script counts the number of probes in each mismatch group (0-4) in an XML file.
process_megablast_output.sh This script takes the mega_blast output, sorts it, and then validates the results (using validate_megablast_output.pl) to make sure that the probes are unique.
qsub_generate_probes.pl Run multi_soop.pl once for each input file in the current directory.
qsub_megablast.pl Run mega_blast once for each input file in the current directory.
qsub_megablast_fix.pl
qsub_megablast_validation.pl For each input file in the current directory, sort the file, then run validate_megablast_output.pl.
qsub_separate_probes.pl Run separate_all_probes.pl once for each chromosome represented in XML input files.
qsub_seperate_probes.pl
rank_probes.pl This script counts the number of probes for each species category.
reprocess.sh This script reads the reprocess files and generates new probes from the areas that generated non-unique probes.
reprocess_genome.sh This script is used for reprocessing: it takes all the *.mini files and generates probes from them.
scrubber.pl This program takes either a FASTA-formatted file or a PipMaker output file and replaces the repetitive sequence (lower case) with X's, so that Soop ignores the repetitive sequence when generating probes.
select_db_probes.pl This script takes a database-generated file (dbfile, in XML v1 format) and a probe file (XML v2 format) and creates an XML file that contains only the probes from the database. The XML file is in v2 format and contains all the information from the probe file.
select_probes.pl This script takes a species, target, and bp range and returns all the probes in that range. Currently everything is hard-coded.
separate_all_probes.pl For one chromosome, match validation results with full XML probe information to produce two output files, one limited to unique probes and the other to non-unique probes.
separate_probes.pl This script takes an XML probe file and, based on the results from megablast, separates the probes into an XML.VALID or XML.INVALID file.
soop.pl This is the original SOOP script to generate probes, prior to XML output.
soop_target.sh This script goes through the targets and generates probes from all the LAT output files (*.verbose).
soop_v2.pl This is the original SOOP script to generate probes, prior to XML output. Use of ungapped alignments is improved in v2 (version 2).
sooper_xml.pl This script takes the standard input formats, along with AXTE, and generates probes in XML format.
sooper_xml_v2.pl This script takes the standard input formats, along with AXTE, and generates probes in XML format. Use of ungapped alignments is improved in v2 (version 2).
sort_invalid_probes.sh This script takes all the invalid probe files and combines them using combine.pl.
sort_valid_probes.sh This script takes all the valid probe files and combines them using combine.pl.
summarize_results.sh This script calculates the percent coverage of probes.
thomas_spliter.pl
truncate_fasta_header.pl This script takes a FASTA file, makes sure each header is unique, and removes all text after the first word of each header.
unique_fasta_header.pl This script takes a FASTA file and makes sure each header is unique. This is done by appending a number before the description. The sequence in the FASTA file is not changed.
validate_megablast_output.pl Evaluate a file of sorted megablast output to characterize each probe as unique or non-unique.
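
Among the scripts listed above, scrubber.pl is described as replacing lower-case (repetitive) sequence with X's so that probe generation skips it. A minimal Python sketch of that masking idea follows. It is not the lab's code: the function name mask_repeats is hypothetical, only the FASTA case is shown (PipMaker output is not handled), and header lines are assumed to start with '>'.

```python
import re
from typing import Iterable, List

# Any lower-case letter is treated as soft-masked (repetitive) sequence.
LOWER_CASE = re.compile(r"[a-z]")


def mask_repeats(fasta_lines: Iterable[str]) -> List[str]:
    """Replace lower-case bases with 'X' on sequence lines only.

    Header lines (starting with '>') are copied unchanged so that
    lower-case text in a description is not altered.
    """
    masked: List[str] = []
    for line in fasta_lines:
        if line.startswith(">"):
            masked.append(line)
        else:
            masked.append(LOWER_CASE.sub("X", line))
    return masked
```

Replacing each base one-for-one keeps every remaining base at its original coordinate, so probe positions reported downstream still refer to the unmasked sequence.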