Lars Yencken  committed bb53325

Basic utilities for drake.

  • Branches master

Files changed (3)

File .gitignore

+.DS_Store
+*.pyc
+*.pyo
+*.orig
+*.bak

File README.md

+# drakeutil
+
+Utilities for making life easier in Python with [Drake](https://github.com/Factual/drake) workflows.
+
+## Installing
+
+Run `pip install drakeutil`, then for Python steps inside your workflow include:
+
+```
+somefile.out <- somefile.in [python]
+    from drakeutil import *
+```

File drakeutil/__init__.py

+# -*- coding: utf-8 -*-
+#
+#  __init__.py
+#  drakeutil
+#
+
+"""
+Helpers for Drake workflows.
+"""
+
+from shutil import copy, move, copytree, rmtree  # noqa
+from os import path, rename, stat  # noqa
+from datetime_tz import datetime_tz
+
+import subprocess
+
+
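+def hdfs_timestamp(filename):
+    "When was this HDFS path last modified? Returns None if it does not exist."
+    # universal_newlines=True yields text rather than bytes, so the
+    # 'No such' check below also works on Python 3.
+    stdout, stderr = subprocess.Popen(['hadoop', 'fs', '-stat', filename],
+            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
+            universal_newlines=True).communicate()
+    if not stdout and 'No such' in stderr:
+        return None
+
+    return datetime_tz.smartparse(stdout.strip())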
+
+
+def hdfs_exists(filename):
+    "Does this file or directory exist?"
+    p = subprocess.Popen(['hadoop', 'fs', '-test', '-e', filename])
+    p.communicate()
+    return p.returncode == 0
+
+
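+def file_timestamp(filename):
+    "When was this local file last modified? Returns None if it does not exist."
+    try:
+        return datetime_tz.utcfromtimestamp(stat(filename).st_mtime)
+    except OSError:
+        # stat() failed, most likely because the file is missing.
+        return None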
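
The timestamp helpers in this commit return `None` for a missing file, which suggests they are meant for freshness checks between a step's input and output. The sketch below illustrates that idea using only the standard library. It is not part of the commit: `is_stale` is a hypothetical helper, and `file_timestamp` is re-implemented here with `datetime` instead of `datetime_tz` so the example runs without extra dependencies.

```python
import os
from datetime import datetime, timezone


def file_timestamp(filename):
    """Return the file's mtime as an aware UTC datetime, or None if missing.

    Stand-in for drakeutil.file_timestamp, assumed to behave the same way
    apart from returning a plain datetime rather than a datetime_tz.
    """
    try:
        return datetime.fromtimestamp(os.stat(filename).st_mtime,
                                      tz=timezone.utc)
    except OSError:
        return None


def is_stale(target, source):
    """Hypothetical helper: True if target is missing or older than source."""
    source_ts = file_timestamp(source)
    if source_ts is None:
        raise ValueError("source does not exist: %s" % source)
    target_ts = file_timestamp(target)
    return target_ts is None or target_ts < source_ts
```

A step could call `is_stale("somefile.out", "somefile.in")` before doing expensive work. Drake already tracks timestamps for a step's declared inputs and outputs, so a check like this would mainly be useful for intermediate files that the workflow does not declare.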