[Zinc-fans] Query search and downloading issues

John J. Irwin jji at cgl.ucsf.edu
Mon Mar 9 12:32:51 PDT 2009


Hi Sergio

Thanks for your email.

Sergio Mares-Samano wrote:
> Dear Zinc Fans,
>
> I have experienced some difficulties using the ZINC database:
>
>     * It is not quite clear to me how to search for compounds similar
>       to several inputs that share a basic structure but carry
>       different substructures, using the SMILES feature. For example,
>       I'm trying to download compounds similar to X-a, X-b and X-c,
>       where X is the basic structure and a, b and c are substructures.
>       First, I specify the SMILES of each compound and save them using
>       the "save smiles" option. However, this strategy seems to return
>       compounds similar to Xabc. So, how can I conduct a search for
>       compounds similar to X-a AND X-b AND X-c using the SMILES
>       feature? Which similarity algorithm is used?
>
The similarity search feature in ZINC is very basic, and has a number of
important limitations.  It is designed mainly to give you a chance to
find a very small number of molecules.  If you really want to have full
control to pick particular compounds, may I suggest you download the
entire database and search it on your local machine.

That said, if you want to search for compounds containing more than one
functional group, you join the patterns together with a dot.  This is
the standard SMARTS notation for disconnected fragments.  For instance,
to look for a benzylsulfonamide and a pyridine in the same molecule, the
pattern to use would be

    NS(=O)(=O)Cc1ccccc1.n1ccccc1
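
As a minimal sketch of the dot-joining described above, the pattern could
be assembled as follows. This is plain Python using only the standard
library; the helper name join_fragments is hypothetical and is not part
of ZINC. It only builds the query string and does not run any search.

```python
from typing import Sequence


def join_fragments(fragments: Sequence[str]) -> str:
    """Join functional-group patterns with a dot (hypothetical helper).

    Each fragment is one pattern, such as a benzylsulfonamide or a
    pyridine. The result is a single dot-separated query string.
    """
    cleaned = [fragment.strip() for fragment in fragments]
    if not cleaned or any(not fragment for fragment in cleaned):
        raise ValueError("every fragment must be a non-empty pattern")
    return ".".join(cleaned)


# The two fragments from the example in this message.
pattern = join_fragments(["NS(=O)(=O)Cc1ccccc1", "n1ccccc1"])
print(pattern)  # NS(=O)(=O)Cc1ccccc1.n1ccccc1
```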

In the future, we hope to make the search feature more robust.  For now,
it just gives you a way to take a glimpse inside ZINC.
>
>     * After performing a query search, a set of 3500 molecules is
>       displayed. The total number of molecules included in the set was
>       verified by downloading the "table". However, when trying to
>       download the sdf file, only 692 molecules are actually
>       downloaded. How can I download the full molecule set?
>
The search facility is really only designed to retrieve small numbers of
molecules.  For working with large sets, we recommend downloading a
subset that covers everything you want, and then selecting from among
those on your local machine.
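
As a rough sketch of the local selection step, assuming the downloaded
subset is an SDF file in which each record ends with a "$$$$" line and
the first line of each record holds the molecule identifier, something
like the following could pick out the records you want. The function
names and the identifiers in the comments are made up for illustration.

```python
from typing import Iterable, Iterator, List, Set


def iter_sdf_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each SDF record as a list of lines, ending with '$$$$'.

    Trailing lines that are not closed by '$$$$' are dropped.
    """
    record: List[str] = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if line.strip() == "$$$$":
            yield record
            record = []


def select_records(lines: Iterable[str], wanted_ids: Set[str]) -> Iterator[List[str]]:
    """Yield only the records whose first (title) line is in wanted_ids."""
    for record in iter_sdf_records(lines):
        if record[0].strip() in wanted_ids:
            yield record
```

An open file handle can be passed as the lines argument, so a large
subset is read one record at a time rather than loaded into memory.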


Good docking

John
UCSF ZINC Team
