Adding YTEP-0028 for different unit systems.

#58 Merged at 8027b05
Repository jzuhone
Branch
default
Repository yt_analysis
Branch
default
Reviewers
Description

This YTEP outlines a proposal for adding the capability to switch between different unit systems to yt. Once the PR for this functionality is submitted, the link will appear here.

1. Nathan Goldbaum

This is really awesome John! Thanks for the careful design discussion here.

2. Jonah Miller

Hi. @MattT asked me to look at this and comment. I'm assuming his intent was that I'd comment based on the needs of my domain, which is numerical relativity.

This is a really nice, fine-grained system, and I like it a lot. But I think it would probably be challenging to use for numerical relativists.

Correct me if I'm wrong, but it seems as if the intent is that fundamental constants are assumed to have the same units as the cgs case, modulo some SI prefactors? For example, you describe having to handle this problem with magnetic fields in cgs vs. mks units as a special case. And basically the difference is that in cgs units, there's an extra factor of the permeability of free space.

Similarly, relativists and relativity code use the convention that Newton's constant and the speed of light are both unitary and unitless. You can do this, but it throws all the mks conventions out of whack. For example, time, distance, and mass all have the same units and are usually measured in solar masses.

This could be handled at the frontend level, of course, like you suggest for cgs, but it would be really nice to have some mechanism in yt for imposing natural unit systems. One possible straightforward generalization would be to create several classes of common "categories" for units: e.g., 'cgs,' 'mks,' and 'natural,' which make different choices for fundamental constants. And then leave the remaining machinery in place. Then a new unit system would have a category and a set of units within category. Categories could be defined by picking a set of fundamental units, like length, mass, and time for mks, and a way to convert to the fundamental units of another category. Would that make sense?

1. Jonah Miller

I should add: thank you for inviting me to contribute to the discussion. This is definitely relevant to my interests regarding yt.

2. John ZuHone author

First, welcome to yt (from me at least), and thanks for contributing to this discussion!

I read over your comments, and I see where you are coming from. I think within the current framework of yt we can get 80-90% of the way to where you'd like to get. Let me explain:

First, to clear up some yt terminology. The yt unit system is based on the concepts of "units" and "dimensions". For us, "units" are things like:

`"cm"`, `"m"`, `"kg"`, `"Myr"`, `"J/m**3"`, `"ft/s**2"`, `"esu"`

and so on. Dimensions, on the other hand, are things like:

`"length"`, `"mass"`, `"time"`, `"velocity"`, `"energy"`, `"flux"`

and so on. As you can see, our definitions of both of these categories include both the "base" units and dimensions of a unit system and compound combinations of them, as well as prefixed versions in the units category.

The basic goal of the yt unit system is to take quantities with units, perform mathematical calculations on them, and be able to convert between quantities that have different units but the same dimensions. This means between cgs and MKS, but also between cgs and imperial units, or MKS and imperial units, or between a set of units where the base is `"kpc"`, `"Myr"`, and `"Msun"` and MKS, or, you get the idea.

Also, in this scheme, physical constants do not have a special status per se--they just unitful quantities that yt is aware of, but otherwise play no central role in defining units. Obviously, in order to convert between the different systems, we need some sense of what the conversion factors are, which are stored as well, but these are conceptually different from the physical constants themselves.

Which brings us to your description of "geometrized units". Let's take a canonical example to work with. In most of the unit systems I described, the formula for the Schwarzschild radius would be (using the yt names for the physical constants):

```from yt.units import G, clight, Msun
M = 1.0*Msun
R_s = 2*G*M/(clight*clight)
```

The above is actually a set of lines of code that works in yt, in the current development version. But of course, in geometrized units, this would be:

```R_s = 2*M
```

If we check the dimensionality of the first formula, we can see what is happening:

```print(G.units.dimensions)
print(clight.units.dimensions)
print(M.units.dimensions)
print(R_s.units.dimensions)
```

which yields:

```(length)**3/((mass)*(time)**2)

(length)/(time)

(mass)

(length)
```

as we expect. However, if we did the same for the second formula, we'd get this:

```print(R_s.units.dimensions)
```
```(mass)
```

which makes sense, of course, in the scheme you outlined above. But the drawback from the perspective of the yt units system is that there is no way now to know this is really a "length" that we are looking at. The idea is that we want to be able to convert units automatically, without having to worry about what conversion factors to apply ourselves--the code takes care of it.

In this vein, I cooked up a solution within the proposed scheme (and added it to my code that I will submit soon) that is very close (but not quite) to what you are suggesting. In essence, we define a new unit system of "geometrized units", with base units of `"L_geom"`, `"T_geom"`, and `"M_geom"`, subject to the following two constraints:

1. `1 M_geom == 1 Msun`
2. In these units, c = G = 1.

In this system, if we wanted to calculate the Schwarzschild radius, we'd do something like this:

```import yt

# Get the object for the geometrized units system
geometrized_units_system = yt.unit_system_registry["geometrized"]

# Get the constants in this system
G = geometrized_units_system.constants.G
clight = geometrized_units_system.constants.clight

M = yt.YTQuantity(1.0, "M_geom") # this is the same as 1 Msun

# print the constant values
print(G)
print(clight)

R_s = 2*G*M/(clight*clight)

# print it
print(R_s)
```

which yields (apologies for the round-off error):

```1.0 L_geom**3/(M_geom*T_geom**2)

0.9999999999999999 L_geom/T_geom

2.0000000000000004 L_geom
```

So, we can see that `R_s = 2*M` as expected, and that G = c = 1, but that they have units attached that are essentially just serving as placeholders for the dimensions. The value of `R_s` is in a length unit, so we can convert it easily and automatically:

```print(R_s.in_units("m"))
```
```2953.0554297898207 m
```

I realize that this is not exactly what numerical relativists want, since as you mentioned above you would prefer to regard G and c as unitless, and dispense with their use in formulas altogether. But, I think this gets us most of the way there? The advantage to this approach is that it retains the ability to convert the final result automatically to a different unit in the correct dimension, but more importantly, the derived field definitions in the yt code base (for things like angular momentum, kinetic energy, etc.) would "just work" for calculations using this system as well.

We could attempt something like the full scheme you mentioned where the length, mass, and time dimensions where actually the same thing, but this would be a dramatic break from the current way that yt thinks about units and dimensions, and would be a bigger undertaking. Not impossible, though.

3. Jonah Miller

Thanks, @John ZuHone !

What you're describing makes sense to me. The more complete system would be better, since it would remove the burden of adding fundamental constants to formulae from the relativist. However, there's a point where the costs of modifying a codebase outweigh the benefits.

Perhaps as a stopgap, we could also add a utility function which calculates what combination of fundamental constants are required to convert one dimension to another? So that, in the Schwarzschild radius example, you could make a call like

```M = yt.YTQuantity(1.0, "M_geom")
m_to_l = convert_dimensions('mass','length')
R_s = 2*m_to_l*M
```

and it would behave correctly? Naively, I don't think this would be too difficult, and it would help ease the burden of the fact that G and c are unitful. Actually, modulo time constraints, I'd be happy to write the routine.

1. John ZuHone author

This is a good idea--the only thing I would change about it is that we wouldn't need to have a function that does it. The number of combinations of G and c that you'd need are probably small enough that we could just create all of them and put them into a module, e.g.:

```m_to_l = (G/(clight*clight)).in_base("geometrized")
m_to_t = m_to_l/clight.in_base("geometrized")
l_to_m = 1/m_to_l
t_to_m = 1/m_to_t
```

etc., unless you think we'd need to have the ability to have arbitrary combinations.

1. Jonah Miller

I think arbitrary combinations is better. It's true that the obvious choices are just conversions between t,m, and l. But I can easily see the need for different arbitrary combinations, which the user ideally shouldn't have to worry about. For example, what's the relevant conversion of pressure from natural units to cgs. It's force per area, right? But in natural units, force is unitless so pressure is in units of 1/M_sun^2. Ideally "convert_dimensions" should be able to figure that out.

For extensibility reasons, I also like having a function api, rather than a bunch of constants... even if that's all it is under the hood.

I can do some reading on how to implement this. Once I understand how dimensions are implemented in yt, I don't think it should be too difficult. There should be some simple known symbolic algebra algorithms to do it.

1. Jonah Miller

Ah okay, thanks. Since they're already in sympy, it should be just as straightforward to generate conversion factors for arbitrary combinations... just use sympy to compose the two basic ones (m_to_l and l_to_t).

1. Nathan Goldbaum

It's not really straightforward to declare two sympy symbol objects to be equal to each other.

OTOH, if you could define a set of dimensions along with the unit system, it would be straightforward to do what you want to do - the unit system would declare its base dimensions and the relationships between them.

Right now we only have one global set of dimensions (the ones I linked to above). It would probably be possible to refactor the dimension system to make it possible for unit system objects to define custom sets of dimensions, but there are definitely tricky parts. How would I convert units created from a simulation with one set of dimensions to one in another unit system?

In any case, it would take a bit of work, and I think is mostly orthogonal to the direction John is going with this YTEP. I also think it will be easier to implement what you're looking for once this YTEP is implemented, since there will be first-class unit system objects.

1. Jonah Miller

Sure, that seems reasonable to me. I admit I hadn't thought very hard about it. I'll keep my eyes on this YTEP and stay involved. But I'll hold off on implementing anything new.

4. MattT

This came up during the triage today, and we think that the current status is that this is implemented, and that subsequent work and thought has to go in for the general units that Jonah brought up above.