Detailed usage documentation: http://bedtools.readthedocs.org/en/latest/

Collectively, the bedtools utilities are a Swiss Army knife of tools for a wide range of genomics analysis tasks. The most widely used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
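To make "genome arithmetic" concrete, here is a rough sketch of what a merge operation does to a set of intervals. It is written in plain awk for illustration only and is not how bedtools implements it; the function name merge_sorted_bed is made up for this example. It assumes the input is BED-like (chrom, start, end) and already sorted by chromosome and then by start position.

```shell
# Illustrative only: merges overlapping or book-ended intervals.
# Assumes input on stdin is sorted by chrom, then start.
# merge_sorted_bed is a hypothetical helper, not part of bedtools.
merge_sorted_bed() {
    awk 'BEGIN { OFS = "\t" }
    # First record: open the first running interval.
    NR == 1 { chrom = $1; start = $2 + 0; cur_end = $3 + 0; next }
    # Same chromosome and touching/overlapping: extend the running interval.
    $1 == chrom && ($2 + 0) <= cur_end {
        if (($3 + 0) > cur_end) cur_end = $3 + 0
        next
    }
    # Otherwise: emit the finished interval and open a new one.
    { print chrom, start, cur_end; chrom = $1; start = $2 + 0; cur_end = $3 + 0 }
    # Emit the last running interval, if there was any input.
    END { if (NR > 0) print chrom, start, cur_end }'
}
```

For example, piping the intervals chr1:100-200, chr1:150-300 and chr1:400-500 through merge_sorted_bed collapses the first two into chr1:100-300 and leaves the third unchanged. In practice one would use bedtools sort followed by bedtools merge instead.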

Summary of available tools.

bedtools supports a wide range of operations for interrogating and manipulating genomic features. The table below summarizes the tools available in the suite.

annotate      Annotate coverage of features from multiple files.
bamtobed      Convert BAM alignments to BED (& other) formats.
bamtofastq    Convert BAM records to FASTQ records.
bed12tobed6   Breaks BED12 intervals into discrete BED6 intervals.
bedpetobam    Convert BEDPE intervals to BAM records.
bedtobam      Convert intervals to BAM records.
closest       Find the closest, potentially non-overlapping interval.
cluster       Cluster (but don't merge) overlapping/nearby intervals.
complement    Extract intervals _not_ represented by an interval file.
coverage      Compute the coverage over defined intervals.
expand        Replicate lines based on lists of values in columns.
flank         Create new intervals from the flanks of existing intervals.
genomecov     Compute the coverage over an entire genome.
getfasta      Use intervals to extract sequences from a FASTA file.
groupby       Group by common cols. & summarize oth. cols. (~ SQL "groupBy")
igv           Create an IGV snapshot batch script.
intersect     Find overlapping intervals in various ways.
jaccard       Calculate the Jaccard statistic b/w two sets of intervals.
links         Create a HTML page of links to UCSC locations.
makewindows   Make interval "windows" across a genome.
map           Apply a function to a column for each overlapping interval.
maskfasta     Use intervals to mask sequences from a FASTA file.
merge         Combine overlapping/nearby intervals into a single interval.
multicov      Counts coverage from multiple BAMs at specific intervals.
multiinter    Identifies common intervals among multiple interval files.
nuc           Profile the nucleotide content of intervals in a FASTA file.
overlap       Computes the amount of overlap from two intervals.
pairtobed     Find pairs that overlap intervals in various ways.
pairtopair    Find pairs that overlap other pairs in various ways.
random        Generate random intervals in a genome.
reldist       Calculate the distribution of relative distances b/w two files.
shuffle       Randomly redistribute intervals in a genome.
slop          Adjust the size of intervals.
sort          Order the intervals in a file.
subtract      Remove intervals based on overlaps b/w two files.
tag           Tag BAM alignments based on overlaps with interval files.
unionbedg     Combines coverage intervals from multiple BEDGRAPH files.
window        Find overlapping intervals within a window around an interval.

Installation: yum install BEDTools

1. Convert a BAM file (TopHat output) to FASTQ

First, merge the accepted_hits.bam and unmapped.bam files produced by the alignment:

samtools  merge RC6-1_ATTCCT_L005.bam accepted_hits.bam unmapped.bam

This produces the merged file RC6-1_ATTCCT_L005.bam.

Next, sort this BAM file by read name:

samtools_0.1.18 sort -n RC6-1_ATTCCT_L005.bam RC6-1_ATTCCT_L005.sorted

This produces RC6-1_ATTCCT_L005.sorted.bam.

Finally, convert it with bedtools:

bedtools bamtofastq -i RC6-1_ATTCCT_L005.sorted.bam -fq RC6-1_ATTCCT_L005_R1.fastq -fq2 RC6-1_ATTCCT_L005_R2.fastq

This yields the paired-end FASTQ files.
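The three steps above can be wrapped into one small shell function, sketched below. The names run, bam_to_fastq, DRY_RUN and SAMTOOLS are made up for this example. The sort call uses the legacy samtools 0.1.x syntax from the commands above (output given as a prefix); with samtools 1.x the equivalent would be samtools sort -n -o out.bam in.bam, so adjust that line if you are on a newer version.

```shell
# Hypothetical wrapper around the merge / sort / bamtofastq steps.
# Set DRY_RUN=1 to print the commands instead of running them.
# Set SAMTOOLS to pick a specific binary, e.g. SAMTOOLS=samtools_0.1.18.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"          # show the command only
    else
        "$@"               # actually execute it
    fi
}

# Usage: bam_to_fastq MAPPED_BAM UNMAPPED_BAM OUTPUT_PREFIX
bam_to_fastq() {
    mapped=$1
    unmapped=$2
    prefix=$3
    samtools_bin=${SAMTOOLS:-samtools}

    # 1. Merge mapped and unmapped reads into one BAM.
    run "$samtools_bin" merge "$prefix.bam" "$mapped" "$unmapped" &&
    # 2. Sort by read name (legacy 0.1.x syntax: output is a prefix).
    run "$samtools_bin" sort -n "$prefix.bam" "$prefix.sorted" &&
    # 3. Write mate 1 and mate 2 to separate FASTQ files.
    run bedtools bamtofastq -i "$prefix.sorted.bam" \
        -fq "${prefix}_R1.fastq" -fq2 "${prefix}_R2.fastq"
}
```

Running DRY_RUN=1 bam_to_fastq accepted_hits.bam unmapped.bam RC6-1_ATTCCT_L005 prints the same three commands used above, which is a cheap way to check the file names before starting a long job. Sorting by read name matters because bamtofastq expects the two mates of each pair to be adjacent in the BAM file.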

05-02 06:36