[Zinc-fans] ZINC: downloadable purchasing info?

John J. Irwin jji at cgl.ucsf.edu
Tue Sep 15 21:39:08 PDT 2009


Hi Marc


marc wrote:
> Hi John (and other ZINC fans),
>
> Quick question that is hopefully simple:
>
> When downloading the purchasing information for the "all chemical set", 
> the file format appears to be XLS. 
>
> i.e.       http://zinc.docking.org/subset1/6/6_purch.xls
>
> Since Excel doesn't allow you to view past ~65,000 records, doesn't that
> make XLS a bad file format to use for a collection of ~13 million
> compounds? To my knowledge, XLS isn't a text-based file format, so it is
> not readable by text editors (or Perl scripts, etc.), and you're stuck
> with Excel (or maybe OpenOffice).
>   
1. Excel 2007 and later do not have the 64K-row restriction (the limit
there is 1,048,576 rows per sheet).
2. The file is tab-delimited text with an .xls extension, so you can just
use grep/perl/whatever.
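
To illustrate point 2, here is a minimal Python sketch that streams such a file line by line, so the row count never matters. It assumes only what is stated above: the file is plain tab-delimited text despite the .xls extension. The function names (iter_rows, grep_rows) are hypothetical, and nothing is assumed about the actual column layout.

```python
import csv
from typing import Iterator, List


def iter_rows(path: str) -> Iterator[List[str]]:
    """Yield one list of fields per line of a tab-delimited text file."""
    # QUOTE_NONE treats quote characters as ordinary data, so stray quotes
    # in a field cannot merge several lines into one record.
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        yield from csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def grep_rows(path: str, needle: str) -> Iterator[List[str]]:
    """Yield only the rows in which some field contains `needle`."""
    for row in iter_rows(path):
        if any(needle in field for field in row):
            yield row
```

Assuming the file has been saved locally as 6_purch.xls, `sum(1 for _ in iter_rows("6_purch.xls"))` would count the rows without loading the whole file into memory.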
> I.e., wouldn't a better file format be something like "csv.gz" or
> "csv.bz" (which can be manipulated more easily in a text editor,
> searched with "grep", etc.)?
>
> Thanks for any clarification!
>
> This actually makes me wonder how you were able to create an XLS file 
> containing data for  ~13 million compounds?
>   
As above, it is a tab-delimited text file.
> -marc
>

