How to install metadown

  1. Download and install ActivePython
  2. Open Command Prompt
  3. Type pypm install metadown
Build matrix (version 0.6 available): Python 2.7, Python 3.2, Python 3.3 on Windows (32-bit), Windows (64-bit), Mac OS X (10.5+), Linux (32-bit), Linux (64-bit)
 
License: GPLv3

Latest release: version 0.6 on Jan 9th, 2014

A programmatic collector/downloader for IOOS-like ISO 19115-2 metadata, written in Python.

Services supported:

## Installation

metadown is available on PyPI and is most easily installed using pip.

```bash
pip install metadown
```

lxml, requests, and thredds_crawler will be installed automatically.

## Usage

### THREDDS

The ThreddsCollector can take two optional arguments. Both are strongly suggested so you don't crawl an entire THREDDS server (unless that is what you want to do).

  • selects (list) - Select datasets based on their THREDDS ID. Python regex is supported.
  • skips (list) - Skip datasets based on their name and catalogRefs based on their xlink:title. By default, the crawler uses four regular expressions (listed below) to skip lists of thousands upon thousands of individual files that are part of aggregations or FMRCs. Setting the skips parameter to anything other than a superset of these defaults runs the risk of having some angry system admins after you.
    • .*files/
    • .*Individual Files.*
    • .*File_Access.*
    • .*Forecast Model Run.*
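
Since any custom skips value should be a superset of the defaults, one way to build it is to start from the four patterns above and append your own. The sketch below only uses the standard `re` module, so it runs without a THREDDS server. The extra pattern and the `matches_any` helper are hypothetical, and the use of `re.match` (anchored at the start of the string) is an assumption about how the crawler applies its patterns, not something this page confirms.

```python
import re

# The four default skip patterns listed above.
DEFAULT_SKIPS = [
    ".*files/",
    ".*Individual Files.*",
    ".*File_Access.*",
    ".*Forecast Model Run.*",
]

# Hypothetical extra pattern, for illustration only.
EXTRA_SKIPS = [".*Test Data.*"]

# A superset of the defaults: nothing is removed, only added.
skips = DEFAULT_SKIPS + EXTRA_SKIPS

# Same select pattern as the THREDDS usage example.
selects = [".*SST-Agg"]


def matches_any(name, patterns):
    """Return True if any pattern matches from the start of name.

    Assumes re.match semantics; the crawler may behave differently.
    """
    return any(re.match(p, name) for p in patterns)


print(matches_any("SST/files/", skips))               # skipped by a default pattern
print(matches_any("Test Data Archive", skips))        # skipped by the extra pattern
print(matches_any("SST/LakeErieSST-Agg", skips))      # not skipped
print(matches_any("SST/LakeErieSST-Agg", selects))    # selected
```

The resulting `skips` and `selects` lists are plain lists of strings, which is the form the two ThreddsCollector arguments are described as taking.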

```python
from metadown.collectors.thredds import ThreddsCollector

# Only crawl datasets whose THREDDS ID matches this pattern
selects = [".*SST-Agg"]
tc = ThreddsCollector("http://tds.glos.us:8080/thredds/mtri/aoc.html", selects=selects)
metadata_urls = tc.run()
print(metadata_urls)
# [
#   'http://tds.glos.us:8080/thredds/iso/SST/LakeErieSST-Agg',
#   'http://tds.glos.us:8080/thredds/iso/SST/LakeHuronSST-Agg',
#   'http://tds.glos.us:8080/thredds/iso/SST/LakeMichiganSST-Agg',
#   'http://tds.glos.us:8080/thredds/iso/SST/LakeOntarioSST-Agg',
#   'http://tds.glos.us:8080/thredds/iso/SST/LakeSuperiorSST-Agg'
# ]
```

### GeoNetwork

```python
from metadown.collectors.geonetwork import GeoNetworkCollector

gnc = GeoNetworkCollector("http://data.glos.us/metadata")
metadata_urls = gnc.run()
print(metadata_urls)
# [
#   ...
#   'http://data.glos.us/metadata/srv/en/iso19139.xml?id=39848',
#   'http://data.glos.us/metadata/srv/en/iso19139.xml?id=39846',
#   'http://data.glos.us/metadata/srv/en/iso19139.xml?id=39845'
# ]
```

### Downloading resulting ISO files

Once you have the metadata URLs for the data you want, do whatever you want! If you would like to download the metadata files and rename or modify them along the way, there is a helper class called XmlDownloader.

XmlDownloader takes in three parameters:

  • url_list (required) - a list of URLs

  • download_path (required) - folder to download files to on your local machine

  • namer (optional) - a Python function for renaming the metadata files before saving them to your local machine. It should take in a single URL and return a string filename for the URL to be saved as.

    Example namer function that renames GeoNetwork URLs:

    ```python
    try:
        from urllib.parse import urlsplit  # Python 3
    except ImportError:
        from urlparse import urlsplit  # Python 2

    def geonetwork_renamer(url, **kwargs):
        # The query string looks like "id=39848"; keep what follows the "="
        uid = urlsplit(url).query
        uid = uid[uid.index("=")+1:]
        return "GeoNetwork-" + uid + ".xml"
    ```

  • modifier (optional) - a Python function for full control over the metadata content. It should take in a single URL and return a str representation of ISO19115-2.

    Example modifier function that translates GeoNetwork's ISO19139 to ISO19115-2:

    ```python
    from metadown.utils.etree import etree

    def geonetwork_modifier(url, **kwargs):
        gmi_ns = "http://www.isotc211.org/2005/gmi"
        etree.register_namespace("gmi", gmi_ns)
        new_root = etree.Element("{%s}MI_Metadata" % gmi_ns)
        old_root = etree.parse(url).getroot()
        # carry over any attributes we need
        for k, v in old_root.attrib.items():
            new_root.set(k, v)
        # carry over children (copy the list first, since appending may move them)
        for e in list(old_root):
            new_root.append(e)
        return etree.tostring(new_root, encoding="UTF-8", pretty_print=True, xml_declaration=True)
    ```

```python
from metadown.downloader import XmlDownloader

XmlDownloader.run(metadata_urls, download_directory, namer=geonetwork_renamer, modifier=geonetwork_modifier)
```
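
A namer for the THREDDS ISO URLs shown earlier could follow the same pattern as the GeoNetwork one. The sketch below is hypothetical and not part of metadown: the function name `thredds_renamer` and the `THREDDS-` filename prefix are made up for illustration, and the `**kwargs` parameter simply mirrors the GeoNetwork example, since this page does not say which keyword arguments the downloader passes.

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2


def thredds_renamer(url, **kwargs):
    """Hypothetical namer: build a filename from the last path segment of the URL."""
    path = urlsplit(url).path
    # Drop any trailing slash, then keep the final segment, e.g. "LakeErieSST-Agg"
    dataset = path.rstrip("/").split("/")[-1]
    return "THREDDS-" + dataset + ".xml"


print(thredds_renamer("http://tds.glos.us:8080/thredds/iso/SST/LakeErieSST-Agg"))
# prints: THREDDS-LakeErieSST-Agg.xml
```

Because it takes a single URL and returns a string filename, a function like this fits the description of the namer parameter above.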

Last updated Jan 9th, 2014
