[Zinc-fans] Inconsistencies in all-purchasable SMILES file

John Irwin jji at cgl.ucsf.edu
Wed Sep 14 23:54:07 PDT 2005

Hi Alberto,
Thanks for your email and for reporting these issues in ZINC. There are indeed
some bookkeeping problems in ZINC. We thought we had fixed most of them, but
apparently a few remain. I'll put it on my list. The
forthcoming version of ZINC will have numerous changes, one of which is (we
hope!) tighter quality control to prevent this sort of problem.


From: zinc-fans-bounces at docking.org [mailto:zinc-fans-bounces at docking.org]
On Behalf Of nspmma at freenet.de
Sent: Wednesday, September 14, 2005 4:33 PM
To: zinc-fans at docking.org
Subject: [Zinc-fans] Inconsistencies in all-purchasable SMILES file

First of all, let me thank you for making this great dataset available. I
think it will be extremely useful.

I have downloaded the two SMILES files for the all-purchasable dataset, but I
found some inconsistencies in the file with the multiple representations per
compound:
- Some of the SMILES have an incorrect double-bond stereo specification, using
a double backslash instead of a single backslash, e.g.:
   CC1CCN(CC1)C(=O)/C(=Cc2ccccc2)/NC(=O)c3ccccc3 ZINC00045169

- In at least one case, I found that the ZINC number is not unique to a
single compound:
   CC(=O)/C(=C/N(C)C)/c1[nH+]cccc1[N+](=O)[O-]     ZINC00118035
   CCOC(=O)/C(=CNc1ccc(cc1)F)/C#N  ZINC00118035
   CCOC(=O)/C(=CNc1ccc(cc1)F)/C#N        ZINC00118035
   Here the first is clearly not an isomer of the other two.
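
A minimal sketch of how both kinds of problem might be screened for
automatically, assuming each line of the file holds a SMILES string followed by
a ZINC ID, separated by whitespace. The function names are hypothetical, and
the heavy-atom count is only a crude stand-in for a proper isomer check with a
cheminformatics toolkit: two records that share an ID but differ in their
heavy-atom counts cannot be isomers or protonation states of one compound.

```python
import re
from collections import Counter, defaultdict

# Bracket atoms, two-letter halogens, then one-letter organic-subset atoms
# (aromatic atoms are written in lower case).
ATOM_RE = re.compile(r"\[([^\]]*)\]|Cl|Br|[BCNOPSFI]|[bcnops]")
# Inside brackets: optional isotope number, then the element symbol.
BRACKET_SYMBOL_RE = re.compile(r"\d*(se|as|[A-Z][a-z]?|[a-z])")


def heavy_atoms(smiles):
    """Return the heavy-atom element counts of a SMILES string (heuristic)."""
    counts = Counter()
    for match in ATOM_RE.finditer(smiles):
        if match.group(1) is not None:  # bracket atom such as [nH+] or [O-]
            inner = BRACKET_SYMBOL_RE.match(match.group(1))
            if inner is None:
                continue
            symbol = inner.group(1)
        else:
            symbol = match.group(0)
        symbol = symbol.capitalize()  # treat aromatic 'c' the same as 'C'
        if symbol != "H":
            counts[symbol] += 1
    return frozenset(counts.items())


def find_conflicting_ids(lines):
    """Return IDs whose records disagree on heavy-atom composition."""
    formulas_by_id = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        formulas_by_id[fields[-1]].add(heavy_atoms(fields[0]))
    return sorted(i for i, f in formulas_by_id.items() if len(f) > 1)


def find_doubled_backslash_ids(lines):
    """Return IDs whose SMILES contain two consecutive backslashes."""
    hits = set()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and "\\\\" in fields[0]:
            hits.add(fields[-1])
    return sorted(hits)


# Records quoted in this message (charged atoms written with an explicit '+').
sample = [
    "CC1CCN(CC1)C(=O)/C(=Cc2ccccc2)/NC(=O)c3ccccc3 ZINC00045169",
    "CC(=O)/C(=C/N(C)C)/c1[nH+]cccc1[N+](=O)[O-] ZINC00118035",
    "CCOC(=O)/C(=CNc1ccc(cc1)F)/C#N ZINC00118035",
]
print(find_conflicting_ids(sample))  # ['ZINC00118035']
```

Comparing heavy atoms only is deliberate: alternative protonation states of the
same compound differ in hydrogens and charges, so they should not be flagged,
whereas a genuinely different compound under the same ID usually will be.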

Did I overlook something, or is there another problem?


