perl - EMBOSS cons for getting a consensus sequence for many files, not just one
I have installed and configured EMBOSS and can run simple commands on the command line to get the consensus of one aligned multi-FASTA file:
% cons
Create a consensus sequence from a multiple alignment
input (aligned) sequence set: dna.msf
output sequence [dna.fasta]: aligned.cons
This is perfect for dealing with one file at a time, but I have hundreds to process. I have started to write a Perl script with a foreach loop to try to process every file, but I guess I need to be outside of the script to run these commands. Any clue on how I can run a command-line-friendly program to get a single consensus sequence in FASTA format from an aligned multi-FASTA file, for many files in succession? I don't have to use EMBOSS; I could use another program. Here is my code so far:
#!/usr/bin/perl
use warnings;
use strict;

my $dir = "/users/roblogan/documents/clustered_barcodes_aligned";
my @arrayoffiles = glob "$dir/*";   # put the files in the directory into an array
#print join("\n", @arrayoffiles), "\n";   # diagnostic print

foreach my $file (@arrayoffiles) {
    print 'cons', "\n";
    print "/users/roblogan/documents/clustered_barcodes_aligned/clustered_barcode_number_*.*.sequences.txt.out", "\n";
    print "*.*.consensus.txt", "\n";
}
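EMBOSS cons has two mandatory qualifiers:
- -sequence (provides the input sequence set)
- -outseq (names the output sequence file)
So you need to provide both of the above on the command line; cons then runs without prompting.
Now change your code a little so that it runs the program on each file in turn: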
my $count = 1;
foreach my $file (@arrayoffiles) {
    my $output_path = "/users/roblogan/documents/clustered_barcodes_aligned/";
    my $output_file = $output_path . "out$count";   # change this to the desired output filename
    # Passing both qualifiers means cons does not stop to ask for input
    my $command = "cons -sequence '$file' -outseq '$output_file'";
    system($command);
    $count++;
}
I hope the above code works for you.
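If it helps, here is a slightly more defensive sketch of the same loop. It is only one way to do it, and it rests on a few assumptions that are not in the question: the consensus files go into a separate output directory (so a second run does not pick up earlier outputs as inputs), each output is named after its input with a made-up ".cons.fasta" suffix, and the helper names build_cons_command and run_cons_on_dir are hypothetical. The only cons options used are the two qualifiers mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;

# Build the argument list for one non-interactive cons run.
# The ".cons.fasta" suffix is an arbitrary naming choice for this sketch.
sub build_cons_command {
    my ($input, $out_dir) = @_;
    my $output = File::Spec->catfile($out_dir, basename($input) . '.cons.fasta');
    return ('cons', '-sequence', $input, '-outseq', $output);
}

# Run cons on every regular file in $in_dir and return the number of failures.
sub run_cons_on_dir {
    my ($in_dir, $out_dir) = @_;
    opendir(my $dh, $in_dir) or die "Cannot open $in_dir: $!";
    my @files = sort grep { -f $_ }
                map { File::Spec->catfile($in_dir, $_) } readdir($dh);
    closedir($dh);

    my $failures = 0;
    foreach my $file (@files) {
        my @cmd = build_cons_command($file, $out_dir);
        # List-form system() bypasses the shell, so odd file names need no quoting.
        if (system(@cmd) != 0) {
            warn "cons failed for $file\n";
            $failures++;
        }
    }
    return $failures;
}

# Only process files when run as a script with exactly two arguments.
if (!caller() && @ARGV == 2) {
    exit(run_cons_on_dir(@ARGV) ? 1 : 0);
}
```

Saved under a name of your choice (say run_cons.pl), it would be called as "perl run_cons.pl input_dir output_dir", where the output directory already exists. Checking the return value of system() means a file that cons cannot read shows up as a warning instead of passing silently.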