﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc	blockedby	blocking	notify_on_close	platform	project
5216	need better parsing of BlastProtein AlphaFold results	Elaine Meng	Zach Pearson	"Often some of the column values are blank, but I believe that to be a bug: the Description field contains many of the other fields' values, so I can see that the values do exist.  Also sometimes the Chain Sequence ID is the gene name and sometimes it is not, which I also suspect to be a bug.  Both of these problems could well be due to the inconsistent format of the database, but that suggests we need to postprocess the Description field to set them right.

For example, see the results of ""alphafold search ldlr_human"" in the screenshot. (I wish I could copy and paste the text, but the dialog does not allow that.)

Description field contains:
Name Title OS=Species OX=L GN=text PE=N SV=M

Many of the hits show a title and species in the Description field but have blank Title and Species columns.

GN is apparently gene name.  No idea what OX, PE, and SV are.  At first I thought OX was length but I can see it is way too high for that.  (Too bad, it would be useful if there were a column for length.)
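
The postprocessing of the Description field could look roughly like the sketch below.  It assumes every Description follows the Name Title OS=... OX=... GN=... PE=... SV=... layout shown above; parse_description is a hypothetical helper, not existing ChimeraX code, and the sample string is only an illustration of that layout.

```python
import re

# Two-letter keys seen in the Description field.  If the text follows the
# UniProt FASTA header convention (an assumption), OS is the organism name,
# OX the organism taxonomy identifier, GN the gene name, PE the
# protein-existence level and SV the sequence version.
_KEY = re.compile(r'(?:^|\s)(OS|OX|GN|PE|SV)=')


def parse_description(description):
    '''Split a Description string into a dict of its parts.

    Hypothetical helper.  Keys that do not occur in the text are simply
    absent from the result, so callers can leave those columns blank.
    '''
    matches = list(_KEY.finditer(description))
    # Everything before the first KEY= is 'Name Title'.
    head = description[:matches[0].start()] if matches else description
    name, _, title = head.strip().partition(' ')
    fields = {'name': name, 'title': title.strip()}
    for i, m in enumerate(matches):
        # A value runs until the next KEY= or the end of the string,
        # so multi-word values such as species names stay intact.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(description)
        fields[m.group(1)] = description[m.end():end].strip()
    return fields


# Illustrative input in the layout described above.
sample = ('LDLR_HUMAN Low-density lipoprotein receptor '
          'OS=Homo sapiens OX=9606 GN=LDLR PE=1 SV=1')
hit = parse_description(sample)
print(hit['title'], '|', hit['OS'], '|', hit.get('GN', ''))
```

With something like this, the Title and Species columns could be filled from the title and OS entries, and a Gene Name column from GN, falling back to blank when GN is missing.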

Currently the Blast output dialog has inconsistent values in the Chain Sequence ID field.  Sometimes it is the gene name, sometimes it is the part of the Name (UniProt Name) preceding the underscore, and sometimes it is blank.  Maybe we should get rid of Chain Sequence ID and add a Gene Name column with the GN value from the Description.  It usually looks reasonable but is occasionally weird, e.g. for the hit highlighted in the second screenshot."	defect	closed	moderate	1.3	Sequence		fixed		Tom Goddard				all	ChimeraX
