
How to install gridd

  1. Download and install ActivePython
  2. Open Command Prompt
  3. Type pypm install gridd
Python versions: Python 2.7 | Python 3.2 | Python 3.3

Platform           Version  Status
Windows (32-bit)   0.0.6    Available
Windows (64-bit)   0.0.6    Available
Mac OS X (10.5+)   0.0.6    Available
Linux (32-bit)     0.0.6    Available
Linux (64-bit)     0.0.6    Available

License
MIT
Latest release
version 0.0.6 on Feb 25th, 2012

Grammar-based Reconstruction of Information-Dense Data tables.

Library for extracting schema information from data tables

Sample usage

Use gridd to extract data from a table in XLS or HTML format and output it (as CSV by default).

> gridd extract file.xls
Category,Country,Residents,Applications
North America,United States,30700700,224912
North America,Canada,33739900,5067
North America,Mexico,112033369,230801
Asia,Japan,127557958,295315
Asia,China,1331380000,229096
Asia,South Korea,48747000,127316
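
Because the default output is plain CSV, it can be consumed by any CSV reader. The following is a minimal sketch, not part of gridd itself: it assumes Python 3 and embeds the sample output shown above as a string, and the helper name applications_by_category is hypothetical.

```python
import csv
import io
from collections import defaultdict

# CSV text copied from the sample `gridd extract file.xls` output above.
SAMPLE_OUTPUT = """\
Category,Country,Residents,Applications
North America,United States,30700700,224912
North America,Canada,33739900,5067
North America,Mexico,112033369,230801
Asia,Japan,127557958,295315
Asia,China,1331380000,229096
Asia,South Korea,48747000,127316
"""


def applications_by_category(csv_text):
    """Sum the Applications column for each Category value.

    Hypothetical helper for illustration; it only relies on the
    column names present in the sample output.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Category"]] += int(row["Applications"])
    return dict(totals)


if __name__ == "__main__":
    for category, total in applications_by_category(SAMPLE_OUTPUT).items():
        print(category, total)
```

In practice the text would come from gridd's standard output or a redirected file rather than an embedded string.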

You can choose your output format (JSON provides more schema info):

> gridd extract -o json file.xls

Or ask for more verbose output:

> gridd extract -v file.xls

Several extraction methods are built in. The parser method is the default; the bayes and webtables methods are also available, and support for additional methods is planned.

> gridd extract -m webtables file.xls

Use predefined external sets of values to improve extraction accuracy.

> gridd extract --use-sets file.xls
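
The extract options shown so far (-o, -v, -m, --use-sets) can also be assembled from a script. The sketch below is an assumption-laden illustration rather than a gridd API: the function names build_extract_command and run_extract are hypothetical, it assumes the gridd executable is on PATH, and it assumes the flags may be combined in a single call, which the examples above do not confirm.

```python
import subprocess


def build_extract_command(path, output_format=None, method=None,
                          verbose=False, use_sets=False):
    """Build the argument list for a `gridd extract` call.

    Only flags that appear in the usage examples are emitted.
    Combining several of them in one call is an assumption.
    """
    cmd = ["gridd", "extract"]
    if output_format is not None:
        cmd += ["-o", output_format]   # e.g. "json"
    if method is not None:
        cmd += ["-m", method]          # e.g. "webtables" or "bayes"
    if verbose:
        cmd.append("-v")
    if use_sets:
        cmd.append("--use-sets")
    cmd.append(path)
    return cmd


def run_extract(path, **options):
    """Run gridd and return its standard output as text.

    Requires gridd to be installed; raises CalledProcessError
    if the command exits with a non-zero status.
    """
    return subprocess.check_output(
        build_extract_command(path, **options),
        universal_newlines=True,
    )
```

For example, build_extract_command("file.xls", method="webtables") produces the same arguments as the webtables command shown above.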

Train the gridd classifier using custom annotations.

> gridd train -a annotations.txt file1.xls file2.xls file3.xls
...
Successfully trained using 3 files.
Model parameters stored in training.json

Run a web interface that shows both the raw data table and the extracted data table.

> gridd web file.xls
 * Running on http://0.0.0.0:5000/

Last updated Feb 25th, 2012

