
Chapter 8: Modules and Packages

Up until this chapter we have been looking at code at the level of the interactive console and simple scripts. This works well for small examples, but when your program gets larger, it becomes necessary to break programs up into smaller units. In Jython, the basic building block for these units in larger programs is the module.

Imports For Re-Use

Breaking code up into modules helps to organize large code bases. Modules can be used to logically separate code that belongs together, making programs easier to understand. Modules are helpful for creating libraries that can be imported and used in different applications that share some functionality. Jython's standard library comes with a large number of modules that can be used in your programs right away.

Import Basics

The following is a very simple program that we can use to discuss imports:

breakfast.py
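As a minimal sketch of what such a module might contain (the names Spam and order_eggs are illustrative assumptions, and print is written with parentheses so that the snippet also runs on newer Python interpreters):

```python
# breakfast.py (illustrative sketch; class and function names are assumptions)

class Spam(object):
    def order(self, number):
        # Repeat the word once per item ordered.
        print("spam " * number)

def order_eggs():
    print(" and eggs!")

# Module-level statements like these run whenever the file is executed,
# whether it is run directly or imported.
s = Spam()
s.order(3)
order_eggs()
```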

We'll start with a couple of definitions. A namespace is a logical grouping of unique identifiers. In other words, a namespace is that set of names that can be accessed from a given bit of code in your program. For example, if you open up a Jython prompt and type dir(), the names in the interpreter's namespace will be displayed.

The interpreter namespace contains __doc__ and __name__. The __doc__ property contains the top level docstring, which is empty in this case. We'll get to the __name__ property in a moment. First we need to talk about Jython modules. A module in Jython is a file containing Python definitions and statements which in turn define a namespace. The module name is the same as the file name with the suffix .py removed, so in our current example the Python file “breakfast.py” defines the module “breakfast”.

Now we can talk about the __name__ property. When a module is run directly, as in "jython breakfast.py", __name__ will contain '__main__'. If a module is imported, __name__ will contain the name of the module, so "import breakfast" results in the breakfast module containing a __name__ of "breakfast". Again from a basic Jython prompt:

Let's see what happens when we import breakfast:

Checking the dir() after the import shows that breakfast has been added to the top level namespace. Notice that the act of importing actually executed the code in breakfast.py. This is the expected behavior in Jython. When a module is imported, the statements in that module are actually executed. This includes class and function definitions. Most of the time, we wouldn't want a module to execute print statements when imported. To avoid this, but allow the code to execute when it is called directly, we typically check the __name__ property. If the __name__ property is '__main__', we know that the module was called directly instead of being imported from another module.
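One way to apply this check to a module like breakfast.py is sketched below. The class and function names are illustrative assumptions; the point is only the guard at the bottom.

```python
# breakfast.py, arranged so that importing it has no visible side effects
# (Spam and order_eggs are assumed names used for illustration)

class Spam(object):
    def order(self, number):
        print("spam " * number)

def order_eggs():
    print(" and eggs!")

if __name__ == '__main__':
    # Reached for "jython breakfast.py", skipped for "import breakfast".
    s = Spam()
    s.order(3)
    order_eggs()
```

The definitions are still executed on import, so Spam and order_eggs remain available to the importing module; only the guarded statements are skipped.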

Now if we import breakfast, we will not get the output:

This is because in this case the __name__ property will contain 'breakfast', the name of the module. If we call breakfast.py from the command line like "jython breakfast.py" we would then get the output again, because breakfast would be executing as __main__:

The Import Statement

In Java, the import statement is strictly a compiler directive that must occur at the top of the source file. In Jython, the import statement is an executable statement that can occur anywhere in the source file, and can even be conditionally executed.

As an example, a common idiom is to attempt to import something that may not be there in a try block, and in the except block import a module that is known to be there.
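A minimal sketch of the idiom, assuming that no module named blah is installed; here the fallback is a locally defined function rather than a second import, which keeps the snippet self-contained:

```python
try:
    # blah is a hypothetical module that is assumed not to exist.
    from blah import foo
except ImportError:
    # Fallback definition used when the import above fails.
    def foo():
        return "hello from backup foo"

print(foo())
```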

If a module named blah had existed, the definition of foo would have been taken from it.

An Example Program

Here is the layout of a contrived but simple program that I will use to describe some aspects of importing in Jython.

The example contains one package: greet, which is a package because it is a directory containing the special __init__.py file. Note that the directory chapter7 itself is not a package because it does not contain an __init__.py. There are three modules in the example program: greetings, greet.hello and greet.people. The code for this program can be downloaded at XXX.

greetings.py

print "in greetings.py"
import greet.hello

g = greet.hello.Greeter()
g.hello_all()

greet/__init__.py

print "in greet/__init__.py"

greet/hello.py

print "in greet/hello.py"
import greet.people as people

class Greeter(object):
    def hello_all(self):
        for name in people.names:
            print "hello %s" % name

greet/people.py

print "in greet/people.py"

names = ["Josh", "Jim", "Victor", "Leo", "Frank"]

Trying out the Example Code

Types of import statements

The import statement comes in a variety of forms that allow much finer control over how importing brings named values into your current module.

I will discuss each of the import statement forms in turn starting with:

This most basic type of import imports a module directly. Unlike Java, this form of import binds the leftmost module name, so if you import a nested module like:

you need to refer to it as “greet.hello” and not just “hello” in your code.
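The same binding rule can be observed with any nested module. The sketch below uses the standard library's xml.dom as a stand-in for greet.hello (an assumption made so that it runs without the example program):

```python
import xml.dom

# Only the leftmost name, "xml", was bound in this namespace.
print(xml.dom.__name__)

try:
    dom.__name__              # the bare submodule name was never bound
    bare_name_bound = True
except NameError:
    bare_name_bound = False

print(bare_name_bound)
```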

The “as foo” part of the import allows you to re-label the “greet.hello” module as “foo” to make it more convenient to call. The example program uses this method to relabel “greet.people” as “people”. Note that it is not important that “people” is also the name of the submodule, except that it might aid in reading the code.

from import Statements

This form of import allows you to import modules, classes or functions nested in other modules. This allows you to import code like this:

In this case it is important that “hello” is actually a submodule of greet. This is not a re-labeling but actually gets the submodule named “hello” from the greet namespace. You can also use the from style of import to import all of the names in a module into your current module using a *. This form of import is discouraged in the Python community, and is particularly troublesome when importing from Java packages (in some cases it does not work, see chapter 10 for details) so you should avoid its use. It looks like this:
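A small sketch of both forms, using the standard library's os and math modules as stand-ins so that it runs without the example program. The star import is executed inside a scratch dictionary only to show how many names it brings in at once.

```python
import os

# Fetch a submodule and a single function by name.
from os import path
from os.path import join

print(path is os.path)             # the very same module object
print(join("greet", "hello.py"))

# The discouraged star form, run in a scratch namespace so that it cannot
# overwrite any names in this module.
scratch = {}
exec("from math import *", scratch)
star_names = sorted([n for n in scratch if not n.startswith("__")])
print(len(star_names))             # every public name in math arrived
```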

Relative import Statements

A new kind of import introduced in Python 2.5 is the explicit relative import. These import statements use dots to indicate how far back you will walk from the current nesting of modules, with one dot meaning the current package.

Even though this style of importing has just been introduced, its use is discouraged. Explicit relative imports are a reaction to the demand for implicit relative imports. If we had wanted to import the Greeter class out of greet.hello so that it could be instantiated with just Greeter() instead of greet.hello.Greeter(), we could have imported it like this:

If you wanted to import Greeter into the greet.people module, you could get away with:

This is a relative import. Since greet.people is a sibling module of greet.hello, the “greet” can be left out. This relative import style is deprecated and should not be used. Some developers like this style so that imports will survive module restructuring, but these relative imports can be error prone because of the possibility of name clashes. There is a new syntax that provides an explicit way to use relative imports, though they too are still discouraged. The import statement above would look like this:
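In explicit form the statement is written "from .hello import Greeter". To see it work end to end, the sketch below builds a throwaway copy of the greet package in a temporary directory and imports it. The file contents are cut-down assumptions, not the full example program.

```python
import os
import sys
import tempfile

def write(path, text):
    # Small helper: create a file with the given contents.
    f = open(path, "w")
    try:
        f.write(text)
    finally:
        f.close()

root = tempfile.mkdtemp()
pkg = os.path.join(root, "greet")
os.mkdir(pkg)
write(os.path.join(pkg, "__init__.py"), "")
write(os.path.join(pkg, "hello.py"), "class Greeter(object):\n    pass\n")
# One leading dot: look for hello in the package that contains people.
write(os.path.join(pkg, "people.py"), "from .hello import Greeter\n")

sys.path.insert(0, root)
import greet.people
import greet.hello

# Both names refer to the same class object.
print(greet.people.Greeter is greet.hello.Greeter)
```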

Aliasing import Statements

Any of the above imports can add an "as" clause to import a module but give it a new name.

This gives you enormous flexibility in your imports, so to go back to the greet.hello example, you could issue:

And use foo in place of greet.hello.
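The "as" clause works the same way on every import form. A short sketch, again using os.path from the standard library as a stand-in for greet.hello (the alias names are arbitrary):

```python
import os.path as foo                 # alias for a nested module
from os import path as p              # alias for a submodule fetched with from
from os.path import join as pjoin     # alias for a single function

print(foo is p)                       # both aliases name the same module
print(pjoin("greet", "hello.py") == foo.join("greet", "hello.py"))
```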

Hiding Module Names

Typically when a module is imported, all of the names in the module are available to the importing module. There are a couple of ways to hide names from a “from module import *”. Starting a name with an underscore (_ or __) marks it as private, so a star import skips it. The second way is to define a list named __all__, which should contain only those names that you wish your module to expose. Neither mechanism blocks explicit access: a hidden name can still be reached as an attribute of the imported module. As an example here is the value of __all__ at the top of Jython's os module:

Note that you can add to __all__ inside of a module to expand the exposed names of that module. In fact, the os module in Jython does just this to conditionally expose names based on the operating system that Jython is running on.
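The effect of __all__, and of extending it later, can be sketched with a small in-memory module. The module name all_demo and the names inside it are hypothetical, and this is not the code from Jython's os module.

```python
import sys
import types

source = (
    "__all__ = ['public']\n"
    "public = 1\n"
    "helper = 2\n"
)
demo = types.ModuleType("all_demo")      # hypothetical module
exec(source, demo.__dict__)
sys.modules["all_demo"] = demo

# A star import only sees the names listed in __all__.
scratch = {}
exec("from all_demo import *", scratch)
before = sorted([n for n in scratch if not n.startswith("__")])

# Extending __all__ widens what later star imports expose.
demo.__all__.append("helper")
scratch = {}
exec("from all_demo import *", scratch)
after = sorted([n for n in scratch if not n.startswith("__")])

print(before)       # ['public']
print(after)        # ['helper', 'public']
print(demo.helper)  # attribute access was never restricted
```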

Module Search Path, Compilation, and Loading

Java Import Example

public class HelloWorld {
    public void hello() {
        System.out.println("Hello World!");
    }

    public void hello(String name) {
        System.out.printf("Hello %s!", name);
    }
}

Here we manipulate that class from the Jython interactive interpreter:

It's important to note that, because the HelloWorld program is located on the Java CLASSPATH, it did not go through the sys.path process we talked about before. In this case the Java class gets loaded directly by the ClassLoader. Discussions of Java ClassLoaders are beyond the scope of this book. To read more about class loading, see the Java Language Specification.

Module Search Path and Loading

Understanding the process of module search and loading is more complicated in Jython than in either CPython or Java because Jython can search both Java's CLASSPATH and Python's path. We'll start by looking at Python's path and sys.path. When you issue an import, sys.path defines the path that Jython will use to search for the name you are trying to import. The objects within the sys.path list tell Jython where to search for modules. Most of these objects point to directories, but there are a few special items that can be in sys.path for Jython that are not just pointers to directories. Trying to import a file that does not reside anywhere in the sys.path (and also cannot be found in the CLASSPATH) raises an ImportError exception. Let's fire up a command line and look at sys.path.

The first blank entry ('') tells Jython to look in the current directory for modules. The second entry points to Jython's Lib directory that contains the core Jython modules. The third and fourth entries are special markers that we will discuss later, and the last points to the site-packages directory where new libraries can be installed when you issue setuptools directives from Jython (see Chapter XXX for more about setuptools). The module that gets imported is the first one that is found along this path. Once a module is found, no more searching is done.
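To look at the search path on your own installation, and to see the ImportError described above, something like the following can be used. The entries printed will differ from one installation to another, and the module name in the import is a deliberately nonexistent placeholder.

```python
import sys

# Entries are searched in order; the first match wins.
for entry in sys.path:
    print(repr(entry))

try:
    import no_such_module_for_this_demo   # assumed not to exist anywhere
    found = True
except ImportError:
    found = False

print(found)
```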

Java Package Scanning

Although you can ask the Java SDK to give you a list of all of the packages known to a ClassLoader using:

there is no corresponding

This is unfortunate for Jython, because Jython users expect to be able to introspect the code they use in powerful ways. For example, users expect to be able to call dir() on Java packages to see what they contain:

To make this sort of introspection possible in the face of merged namespaces requires some major effort the first time that Jython is started (and when jars or classes are added to Jython's path at runtime). If you have ever run a new install of Jython before, you will recognize the evidence of this system at work:

This is Jython scanning all of the jar files that it can find to build an internal representation of the packages and classes available on your JVM. This has the unfortunate side effect of making the first startup on a new Jython installation painfully slow.

How Jython Finds the Jars and Classes to scan

There are two properties that Jython uses to find jars and classes. These settings can be given to Jython using command line settings or the registry (see Chapter XXX). The two properties are:

These properties are comma separated lists of further registry entries that actually contain the values the scanner will use to build its listing. You probably should not change these properties. The entries that they point to are more interesting. The two that potentially make sense to manipulate are java.class.path and java.ext.dirs.

For the java.class.path property, entries are separated as the classpath is separated on the operating system you are on (that is, ";" on Windows and ":" on most other systems). Each of these paths is checked for a .jar or .zip suffix, and any path that has one will be scanned.

For the java.ext.dirs property, entries are separated in the same manner as java.class.path, but these entries represent directories. These directories are searched for any files that end with .jar or .zip, and if any are found they are scanned.
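The two selection rules can be sketched in a few lines of Python. This is only an illustration of the rules as described above, not Jython's actual scanner, and both function names are hypothetical.

```python
import os

ARCHIVE_SUFFIXES = (".jar", ".zip")

def archives_on_class_path(class_path, separator=os.pathsep):
    """java.class.path rule: keep the entries that are themselves archives."""
    entries = [e for e in class_path.split(separator) if e]
    return [e for e in entries if e.endswith(ARCHIVE_SUFFIXES)]

def archives_in_ext_dirs(ext_dirs, separator=os.pathsep):
    """java.ext.dirs rule: look inside each directory for archives."""
    found = []
    for directory in ext_dirs.split(separator):
        if os.path.isdir(directory):
            for name in sorted(os.listdir(directory)):
                if name.endswith(ARCHIVE_SUFFIXES):
                    found.append(os.path.join(directory, name))
    return found

# The plain "classes" directory is skipped; only the archives remain.
print(archives_on_class_path("lib/a.jar:classes:lib/b.zip", ":"))
```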

To control the jars that are scanned, you need to set the values for these properties. There are a number of ways to set these property values, see Chapter XXX for more.

If you only use full class imports, you can skip the package scanning altogether. Set the system property python.cachedir.skip to true or (again) pass in your own postProperties to turn it off.

Compilation

Despite the popular belief that Jython is “interpreted, not compiled”, in reality all Jython code is turned into Java bytecodes before execution. These bytecodes are not always saved to disk, but when you see Jython execute any code, even in an eval or an exec, you can be sure that bytecodes are getting fed to the JVM. The sole exception to this that I know of is the experimental pycimport module that I will describe in the section on sys.meta_path below, which interprets CPython bytecodes instead of producing Java bytecodes.

Python Modules and Packages vs. Java Packages

The basic semantics of importing Python modules and packages versus the semantics of importing Java packages into Jython differ in some important respects that need to be kept carefully in mind.

sys.path

When Jython tries to import a module, it will look in its sys.path in the manner described in the previous section until it finds one. If the module it finds represents a Python module or package, this import will display a “winner take all” semantic. That is, the first Python module or package that gets imported blocks any other module or package that might subsequently get found on any lookups. This means that if you have a module foo that contains only a name bar early in the sys.path, and then another module also called foo that only contains a name baz, then executing “import foo” will only give you foo.bar and not foo.baz.
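The foo example can be reproduced with two temporary directories, each holding its own foo.py. This is a sketch; the directory layout and the values assigned to bar and baz are made up for the demonstration.

```python
import os
import sys
import tempfile

def write(path, text):
    # Small helper: create a file with the given contents.
    f = open(path, "w")
    try:
        f.write(text)
    finally:
        f.close()

first = tempfile.mkdtemp()
second = tempfile.mkdtemp()
write(os.path.join(first, "foo.py"), "bar = 'from the first directory'\n")
write(os.path.join(second, "foo.py"), "baz = 'from the second directory'\n")

# Put both directories at the front of the search path, first one earliest.
sys.path[:0] = [first, second]

import foo

print(hasattr(foo, "bar"))   # True: the earlier foo won
print(hasattr(foo, "baz"))   # False: the later foo was never consulted
```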

This differs from the case when Jython is importing Java packages. If you have a Java package org.foo containing bar, and a Java package org.foo containing baz later in the path, executing “import org.foo” will merge the two namespaces so that you will get both org.foo.bar and org.foo.baz.

Just as important to keep in mind, if there is a Python module or package of a particular name in your path that conflicts with a Java package in your path this will also have a winner take all effect. If the Java package is first in the path, then that name will be bound to the merged Java packages. If the Python module or package wins, no further searching will take place, so the Java packages with the clashing names will never be found.

Naming Python Modules and Packages

Developers coming from Java will often make the mistake of modeling their Jython package structure the same way that they model Java packages. Do not do this. The reverse URL convention of Java is a great, I would even say brilliant, convention for Java. It works very well indeed in the world of Java where these namespaces are merged. In the Python world however, where modules and packages display the winner take all semantic, this is a disastrous way to organize your code.

Suppose you adopt this style for Python. If you are coming from “acme.com”, you would set up a package structure like “com.acme”. If you then try to use a library from your vendor xyz that is set up as “com.xyz”, the first of these on your path will take the “com” namespace, and you will not be able to see the other set of packages.
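The clash can be demonstrated with two temporary directories that each contain a com package. In this sketch acme and xyz are single modules inside com rather than full subpackages, which is a simplification, and each com directory has its own __init__.py.

```python
import os
import sys
import tempfile

def write(path, text):
    # Small helper: create a file with the given contents.
    f = open(path, "w")
    try:
        f.write(text)
    finally:
        f.close()

def make_com_package(root, vendor):
    # Creates root/com/__init__.py and root/com/<vendor>.py
    pkg = os.path.join(root, "com")
    os.mkdir(pkg)
    write(os.path.join(pkg, "__init__.py"), "")
    write(os.path.join(pkg, vendor + ".py"), "name = %r\n" % vendor)

acme_root = tempfile.mkdtemp()
xyz_root = tempfile.mkdtemp()
make_com_package(acme_root, "acme")
make_com_package(xyz_root, "xyz")
sys.path[:0] = [acme_root, xyz_root]

import com.acme
print(com.acme.name)

try:
    import com.xyz           # hidden: the first "com" took the namespace
    xyz_visible = True
except ImportError:
    xyz_visible = False

print(xyz_visible)
```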

Proper Python Naming

The Python convention is to keep namespaces as shallow as you can, and make your top level namespace reasonably unique, whether it be a module or a package. In the case of acme and company xyz above, you might start your package structures with “acme” and “xyz” if you wanted to have these entire codebases under one namespace (not necessarily the right way to go – better to organize by product instead of by organization, as a general rule).

Note: There are at least two sets of names that are particularly bad choices for naming modules or packages in Jython. The first is any top level domain like org, com, net, us, name. The second is any of the domains that Java the language has reserved for its top level namespaces: java, javax.

Advanced Import Manipulation

Import Hooks

To understand the way that Jython imports Java classes we have to understand a bit about the Python import protocol. I won't get into every detail; for that you would want to look at PEP 302.

Briefly, we first try any custom importers registered on sys.meta_path. If one of them is capable of importing the requested module, we allow that importer to handle it. Next, we try each of the entries on sys.path. For each of these, we find the first hook registered on sys.path_hooks that can handle the path entry. If we find an import hook and it successfully imports the module, we stop. If this did not work, we try the builtin import logic. If that also fails, an ImportError is raised. So let's look at Jython's path_hooks.

sys.path_hooks

sys.meta_path

Adding entries to sys.meta_path allows you to add import behaviors that will occur before any other import is attempted, even the default builtin importing behavior. This can be a very powerful tool, allowing you to do all sorts of interesting things. As an example, I will talk about an experimental module that ships with Jython 2.5. That module is pycimport. If you start up jython and issue:

Jython will start scanning for .pyc files in your path and if it finds one, will use the .pyc file to load your module. .pyc files are the files that CPython produces when it compiles Python source code. So, if you have imported pycimport (which adds a hook to sys.meta_path) and then issue:

Jython will scan your path for a file named foo.pyc, and if it finds one it will import the foo module using the CPython bytecodes. Here is the code at the bottom of pycimport.py that defines the MetaImporter and adds it to sys.meta_path:
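As an illustration of the same mechanism (this is not the pycimport code itself), here is a minimal sketch of a meta importer that serves modules from in-memory source text. The class name StringImporter and the module name virtual_spam are hypothetical. The find_module and load_module methods follow PEP 302; the find_spec, create_module and exec_module methods are included only as an assumption about what newer CPython releases call instead, so that the sketch runs there too.

```python
import sys
import types

class StringImporter(object):
    """Hypothetical importer that serves modules from in-memory source."""

    def __init__(self, sources):
        self.sources = sources          # maps module name -> source text

    # PEP 302 protocol.
    def find_module(self, fullname, path=None):
        if fullname in self.sources:
            return self                 # this object is also the loader
        return None                     # decline; other importers may try

    def load_module(self, fullname):
        if fullname in sys.modules:
            return sys.modules[fullname]
        module = types.ModuleType(fullname)
        module.__loader__ = self
        sys.modules[fullname] = module
        self.exec_module(module)
        return module

    # Entry points used by newer CPython releases.
    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.sources:
            return None
        from importlib.util import spec_from_loader
        return spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None                     # use the default module object

    def exec_module(self, module):
        exec(self.sources[module.__name__], module.__dict__)

# Register the importer ahead of the normal import machinery.
sys.meta_path.insert(0, StringImporter({"virtual_spam": "answer = 42\n"}))

import virtual_spam
print(virtual_spam.answer)
```

Because the importer returns None for every name it does not know, all other imports fall through to the usual machinery untouched.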

Conclusion

In this chapter you have learned how to divide code up into modules for the purpose of organization and re-use. We have learned how to write modules and packages, and how the Jython system interacts with Java classes and packages. This ends Part I. We have now covered the basics of the Jython language and are now ready to learn how to use Jython.
