﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc	blockedby	blocking	notify_on_close	platform	project
8041	Add ESMFold database to BlastProtein	Tom Goddard	Zach Pearson	"The ESMFold metagenomic atlas is very similar to the AlphaFold database.  I would like the blast protein command and tool to be able to search this database.  I have created the blast database on plato at

    /wynton/group/ferrin/databases/mol/ESMFold/v1

The database contains 600 million sequences, about three times as many as the AlphaFold database, which has about 200 million.  I think our blast searches currently use 4 threads; given the larger database, it would be good to increase that to 8 threads.
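
As a rough sketch of the thread change, assuming the searches are run with the NCBI BLAST+ blastp program (this ticket does not show how the server actually invokes BLAST): the function name blastp_command, the file names and the tabular output format below are illustrative assumptions, while -num_threads is the blastp option that sets the thread count.

{{{#!python
def blastp_command(query_fasta, db, threads=8):
    '''Build a blastp argument list; db is the BLAST database name or path prefix.'''
    return [
        'blastp',
        '-query', query_fasta,
        '-db', db,
        '-num_threads', str(threads),  # 8 instead of the current 4
        '-outfmt', '6',                # tabular output, an assumption for this sketch
    ]

# Hypothetical file names, for illustration only.
print(' '.join(blastp_command('query.fasta', 'path/to/esmfold_db')))
}}}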

The ESMFold sequence titles contain just the MGnify sequence database identifier, for example MGYP000000000040 in the following entry:

{{{
>MGYP000000000040
MCGVYQSATFQATFFQYSYILHETLADIVVPDTIGGKIRKLRHSLNLAQMQFAKSIHRGFTTVTKWEQELTTTSEKALTNIIEIYKLQENYFDK
}}}
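
For illustration, the identifier could be pulled out of a title line like the one above with a few lines of Python.  This is only a sketch: the function name mgnify_id and the pattern (MGYP followed by 12 digits) are assumptions inferred from the single example shown here, not taken from the BlastProtein code.

{{{#!python
import re

# Assumed pattern: 'MGYP' followed by 12 digits, inferred from the one example entry.
MGNIFY_ID = re.compile(r'^>?\s*(MGYP\d{12})\s*$')

def mgnify_id(title):
    '''Return the MGnify identifier in a sequence title line, or None if it does not match.'''
    m = MGNIFY_ID.match(title)
    return m.group(1) if m else None

# Trailing padding after the identifier is ignored.
print(mgnify_id('>MGYP000000000040      '))
}}}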

So the Blast tool's table of search results will be very simple: no species or descriptive name, just the identifier.  The esmfold fetch command can fetch the predicted structure associated with an identifier.  I can help with that code.

"	enhancement	closed	moderate		Sequence		fixed				7970		all	ChimeraX
