Subject: Re: rename resonances in batch mode?
From: liwenfu24
Date: Sep 3, 2009


The following two scripts work better.

1. gen_rr_list.pl (generate rr.list file)
=====================================================================
#!/usr/bin/perl

if ($#ARGV != 2) { die "
Usage: $0 input_file(prot.seq) 1st_resid(e.g.,1) output_file(rr.list)
Notes:
input_file(prot.seq) the output of formatting_sequences.pl;
1st_resid(e.g.,1) the residue ID of the first residue;
output_file(rr.list) the output file containing the rename resonance list.\n\n";exit;}

$input_file_seq = $ARGV[0];
$resnum_1st_seq = $ARGV[1];
$output_file_list = $ARGV[2];
# open the sequence file for reading and the list file for writing
open(filehandle_inp_seq, "<", $input_file_seq) ||
die ("\ncannot open the input file: $input_file_seq \n\n");
open(filehandle_out_list, ">", $output_file_list) or
die ("\ncannot open the output file: $output_file_list \n\n");

###################################################################

$resnum_seq = $resnum_1st_seq - 1;

while ($readline_seq = <filehandle_inp_seq>)
{

chomp $readline_seq;

@array_seq_temp = split(//, $readline_seq);

# place the sequence into @array_seq and @array.
for ($i = 0; $i <= $#array_seq_temp; $i++){

# keep only one-letter residue codes; anything else (spaces, digits) is skipped
if (index("ACcDEFGHIKLMNPQRSTVWY",$array_seq_temp[$i]) > -1) {
$resnum_seq++;
$array_seq[$resnum_seq] = $array_seq_temp[$i];
$array[$resnum_seq][1] = $resnum_seq; # resnum
$array[$resnum_seq][2] = $array_seq_temp[$i]; # resname
# resid = resname + resnum
}

}

} # while

###################################################################

# e.g., Q260 CA-1 S259 CA
for ($m = $resnum_1st_seq+1; $m <= $resnum_seq ; $m++){
$resid_curr = $array[$m][2].$array[$m][1];
$resid_prev = $array[$m-1][2].$array[$m-1][1];
printf "%s %s %s %s\n", $resid_curr, "CA-1", $resid_prev, "CA";
printf filehandle_out_list "%s %s %s %s\n", $resid_curr, "CA-1", $resid_prev, "CA";

$resid_curr = "";
$resid_prev = "";
}

# e.g., Q260 CB-1 S259 CB
for ($m = $resnum_1st_seq+1; $m <= $resnum_seq ; $m++){
$resid_curr = $array[$m][2].$array[$m][1];
$resid_prev = $array[$m-1][2].$array[$m-1][1];
printf "%s %s %s %s\n", $resid_curr, "CB-1", $resid_prev, "CB";
printf filehandle_out_list "%s %s %s %s\n", $resid_curr, "CB-1", $resid_prev, "CB";

$resid_curr = "";
$resid_prev = "";
}

# e.g., Q260 C-1 S259 C
for ($m = $resnum_1st_seq+1; $m <= $resnum_seq ; $m++){
$resid_curr = $array[$m][2].$array[$m][1];
$resid_prev = $array[$m-1][2].$array[$m-1][1];
printf "%s %s %s %s\n", $resid_curr, "C-1", $resid_prev, "C";
printf filehandle_out_list "%s %s %s %s\n", $resid_curr, "C-1", $resid_prev, "C";

$resid_curr = "";
$resid_prev = "";
}

=====================================================================
2. rename_resonances.csh (rename resonances)
=====================================================================
#!/bin/csh

if ($#argv != 2) then
echo
echo Usage: $0 spectrum.save rr.list
echo
exit
endif

set save = $1
set list = $2

rm -f ${save}_temp
cp $save ${save}_temp

# line starting with rs
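# (csh needs a backslash before each newline inside the quoted awk program)
awk -v spectrum=${save}_temp '{ \
system("sed \"s/|" $1 "|" $2 "|/|" $3 "|" $4 "|/g\" " spectrum " > tempfile") \
system("mv tempfile " spectrum) \
}' $list
# e.g., sed "s/|Q260|CA-1|/|S259|CA|/g" cbcaconh_600mhz_20090619.save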

#line starting with label
awk -v spectrum=${save}_temp '{ \
system("sed \"s/" $1 "N-" $2 "-H/" $1 "N-" $3 $4 "-" $1 "H/g\" " spectrum " > tempfile") \
system("mv tempfile " spectrum) \
}' $list
# e.g., sed "s/Q260N-CA-1-H/Q260N-S259CA-Q260H/g" cbcaconh_600mhz_20090619.save

rm -f ${save}_ok
mv ${save}_temp ${save}_ok

# format of the rr.list file
############################
#Q260 CA-1 S259 CA
############################

=====================================================================
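
For readers who would rather not run Perl, the list-generation step can be sketched in Python. This is only an illustration of the same idea, not a port that has been tested against Sparky files: the function name make_rr_list and its parameters are made up for this example, and the residue alphabet is copied from the Perl script above.

```python
# Residue letters accepted by the Perl script above (copied as-is).
AMINO_ACIDS = set("ACcDEFGHIKLMNPQRSTVWY")


def make_rr_list(sequence, first_resid=1, atoms=("CA", "CB", "C")):
    """Return rr.list lines such as 'Q260 CA-1 S259 CA'.

    sequence    -- one-letter sequence; other characters are ignored
    first_resid -- residue number of the first residue
    atoms       -- atom names whose '-1' variants should be renamed
    """
    residues = [c for c in sequence if c in AMINO_ACIDS]
    # Residue ID = residue letter + residue number, e.g. 'S259'.
    resids = [f"{name}{first_resid + i}" for i, name in enumerate(residues)]
    lines = []
    for atom in atoms:
        # Pair every residue with its predecessor.
        for prev, curr in zip(resids, resids[1:]):
            lines.append(f"{curr} {atom}-1 {prev} {atom}")
    return lines


rr_lines = make_rr_list("MQT", first_resid=1)
print("\n".join(rr_lines))
```

As in the Perl version, all CA lines come first, then CB, then C.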

--- In nmr_sparky@yahoogroups.com, liwenfu24 <liwenfu24@...> wrote:

Following your suggestion, I wrote the script below.
============================================================
#!/bin/csh

if ($#argv != 2) then
echo
echo Usage: $0 spectrum.save rr.list
echo
exit
endif

set save = $1
set list = $2

# line starting with rs
awk -v spectrum=$save '{ \
system("sed \"s/|" $1 "|" $2 "|/|" $3 "|" $4 "|/g\" " spectrum) \
}' $list > ${save}_temp
# e.g., sed "s/|Q260|CA-1|/|S259|CA|/g" cbcaconh_600mhz_20090619.save

#line starting with label
awk -v spectrum=${save}_temp '{ \
system("sed \"s/" $1 "N-" $2 "-H/" $1 "N-" $3 $4 "-" $1 "H/g\" " spectrum) \
}' $list > ${save}_ok
# e.g., sed "s/Q260N-CA-1-H/Q260N-S259CA-Q260H/g" cbcaconh_600mhz_20090619.save

# format of the rr.list file
############################
#Q260 CA-1 S259 CA
############################
============================================================

It works fine!

Thanks again!

Liwen

--- In nmr_sparky@yahoogroups.com, Tom Goddard goddard@ wrote:

Hi Liwen,

Sparky has no batch atom / residue renaming capability. The rename
resonance dialog (rr) can rename one residue/atom in all peak
assignments. But renaming many different atoms and residues at once is
not available.

The Python interface to Sparky does not have functions to change atom
and residue names, nor does it allow reassignment of peaks. So writing
a Sparky Python script to do this is not feasible. Those operations are
done in C++ which can be hard to modify.

One approach you could consider is to edit your Sparky spectrum
session files with a text editor capable of doing batch search and
replace operations. I have never used an editor to replace hundreds of
different text patterns at once, so I don't know if any editors can do
that. For the replacements that just drop the -1 (CA-1 to CA) you
might be able to handle all those with a single search/replace. The
assignment lines in the session files look like:

rs |L46|CA| |R47|N| |R47|HN|

To avoid replacing text that is not really an assignment you might
include the vertical bars | in your search and replace. Peak labels
also have the assignment text. You could try to edit those, or maybe
just delete all the labels in Sparky and recreate them once the
assignment names have been changed.
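
A minimal Python sketch of the bar-delimited search and replace described above. The function name rename_assignment and the sample "before" line are invented for illustration; only the general shape of the rs line is taken from the example in this message. It uses plain string replacement, so no regular-expression escaping is needed.

```python
def rename_assignment(text, old_res, old_atom, new_res, new_atom):
    """Replace |old_res|old_atom| with |new_res|new_atom| everywhere in text.

    Including the vertical bars keeps the match confined to assignment
    fields, so the same words elsewhere in the file are left alone.
    """
    old = f"|{old_res}|{old_atom}|"
    new = f"|{new_res}|{new_atom}|"
    return text.replace(old, new)


# Hypothetical line as it might look before renaming.
line = "rs |R47|CA-1| |R47|N| |R47|HN|"
renamed = rename_assignment(line, "R47", "CA-1", "L46", "CA")
print(renamed)  # rs |L46|CA| |R47|N| |R47|HN|

# Text without the bars is not touched.
note = "note: R47 CA-1 looks weak"
unchanged = rename_assignment(note, "R47", "CA-1", "L46", "CA")
```

Run it on a copy of the session file, never on the only copy.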

Of course make backup copies of your session files if you try any of this.

Tom

-------- Original Message --------
Subject: [nmr_sparky] rename resonances in batch mode?
From: liwenfu24
To: nmr_sparky
Date: 9/2/09 5:02 AM

Dear Dr. Goddard,

I am going to perform the side chain assignment of a protein with 230
residues.

After the backbone assignment, there are atom names like CA-1, CB-1, and
C-1 mixed with CA, CB, and C.
So I want to rename, for example, Q2 CA-1 to M1 CA. The
accelerator rr can do that,
but it is error-prone to rename all these atom names one by one (230 x
3 = 690 renames).

I am wondering whether it is possible to replace all the atom names at once.
For example, we could create a list (4 columns or 2 columns)
and then tell Sparky to read the list and replace all the atom names:

----------------
4 columns
----------------
Q2 CA-1 M1 CA
T3 CA-1 Q2 CA
----------------

----------------
2 columns
----------------
001 M1
002 Q2
----------------

If the atom name to be renamed does not exist, generate a warning file
(log).
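
The requested behaviour (apply a 4-column list, log the names that do not exist) could look roughly like this outside Sparky. It is a sketch under assumptions: the function name apply_rename_list is hypothetical, the session text is assumed to hold assignments as bar-delimited fields such as |Q2|CA-1|, and nothing here uses or represents a Sparky API.

```python
def apply_rename_list(session_text, rr_lines):
    """Apply 4-column renames to session text.

    Each entry looks like 'Q2 CA-1 M1 CA' (old residue, old atom,
    new residue, new atom). Returns (new_text, warnings), where
    warnings lists the entries whose old name was not found.
    """
    warnings = []
    for raw in rr_lines:
        fields = raw.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        old_res, old_atom, new_res, new_atom = fields
        old = f"|{old_res}|{old_atom}|"
        if old not in session_text:
            warnings.append(f"not found: {old_res} {old_atom}")
            continue
        session_text = session_text.replace(old, f"|{new_res}|{new_atom}|")
    return session_text, warnings


# Made-up one-line session fragment for demonstration.
session = "rs |Q2|CA-1| |Q2|N| |Q2|HN|\n"
new_text, log = apply_rename_list(session, ["Q2 CA-1 M1 CA", "T3 CA-1 Q2 CA"])
print(new_text, end="")
print(log)
```

The warnings list can be written to a log file with one entry per line.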

Thanks in advance!

Best regards,

Liwen Fu