clReflect / GetType Discussion

Overview

This was triggered by an email from xxuejie:

From what I see, in the current implementation, for each specialization of GetType, the compiler just generates these two assembly instructions:

			mov eax, dword ptr [hash]
			ret

And while loading the database, you patch the first mov instruction with the right address.

But we hit the following problems when porting to gcc:

First, gcc does not have a "naked" option; at least on x86 and x86-64, it does not support the naked attribute. So gcc adds extra assembly code to GetType, breaking the original patching code.

At http://gynvael.coldwind.pl/?id=15 there is a simulated naked attribute for gcc. But since our solution uses templates, this cannot be used either.

Currently my idea is to inline an assembly instruction like the following:

mov eax, 0xFEDC7654                // This may not be right assembly syntax, I only use this for demonstration purpose

And when patching the function, we search from the start of the function for the value "0xFEDC7654". When we find it, we patch it with the correct type address.

But this solution always suffers from one problem: if the generated assembly code happens to contain another "0xFEDC7654" somewhere, we cannot distinguish it from the value we put there, so we must pick a very good value to use here.
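
The search-and-patch idea from the email can be sketched on a plain byte buffer. This is only an illustration: PatchUniqueMarker is a hypothetical helper, not part of clReflect, and it assumes the function's bytes are already readable and writable. It refuses to patch unless the marker appears exactly once, which is one way to detect the ambiguity described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: scans [code, code + size) for a 32-bit marker value. If the marker
// occurs exactly once it is overwritten with value and true is returned. If it is missing
// or ambiguous (more than one occurrence), nothing is modified and false is returned.
inline bool PatchUniqueMarker(unsigned char* code, std::size_t size,
                              std::uint32_t marker, std::uint32_t value)
{
   assert(code != nullptr);

   std::size_t match_count = 0;
   std::size_t match_offset = 0;
   for (std::size_t i = 0; i + sizeof(marker) <= size; ++i)
   {
      // memcpy avoids unaligned reads; the comparison uses the host byte order
      std::uint32_t candidate;
      std::memcpy(&candidate, code + i, sizeof(candidate));
      if (candidate == marker)
      {
         match_offset = i;
         ++match_count;
      }
   }

   if (match_count != 1)
      return false;

   std::memcpy(code + match_offset, &value, sizeof(value));
   return true;
}
```

Counting the matches before writing means a collision is reported as a failure rather than silently patching the wrong bytes.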

My personal opinion at the moment is:

  • GetType and GetTypeNameHash should have multiple implementations and the correct place for them is no longer in clReflectCpp. Different implementations should instead be placed in clReflectUtil.

My reasoning behind this is that there are many ways of achieving this that may not match what people desire. To encourage a broader use of the clReflect core libraries, pulling them out may be beneficial. Without GetType and GetTypeNameHash support, the equivalents can be achieved at runtime, using the database:

clcpp::Database db;

// Get the name object, which contains the interned string pointer and hash value
// If the type doesn't exist in the database, a null text pointer and zero hash value are returned
clcpp::Name type_name = db.GetName("MyTypeName");

// Lookup the type by type name hash value
const clcpp::Type* type = db.GetType(type_name.hash);

That seems sufficient for general use but is not good enough for efficient runtime use, because:

  • Two binary searches are being performed on the name list and type list.
  • The type name hash is calculated in order to perform the binary search on the name list.
  • The type name string gets added to the generated executable, consuming memory.
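
To make those costs concrete, here is a minimal mock of the lookup path. Everything in it is an assumption for illustration: MockName, MockType and MockDatabase are stand-ins rather than the real clcpp types, and the hash is FNV-1a used purely as a placeholder for whatever clReflect actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for clcpp::Name and clcpp::Type
struct MockName { const char* text; unsigned int hash; };
struct MockType { unsigned int name_hash; unsigned int size; };

// Placeholder string hash (FNV-1a); not the hash clReflect uses
inline unsigned int HashNameString(const char* text)
{
   assert(text != nullptr);
   unsigned int hash = 2166136261u;
   for (; *text != 0; ++text)
   {
      hash ^= static_cast<unsigned char>(*text);
      hash *= 16777619u;
   }
   return hash;
}

struct MockDatabase
{
   std::vector<MockName> names;   // kept sorted by hash
   std::vector<MockType> types;   // kept sorted by name_hash

   void AddType(const char* text, unsigned int size)
   {
      const unsigned int hash = HashNameString(text);
      names.push_back(MockName{ text, hash });
      types.push_back(MockType{ hash, size });
      std::sort(names.begin(), names.end(),
         [](const MockName& a, const MockName& b) { return a.hash < b.hash; });
      std::sort(types.begin(), types.end(),
         [](const MockType& a, const MockType& b) { return a.name_hash < b.name_hash; });
   }

   MockName GetName(const char* text) const
   {
      const unsigned int hash = HashNameString(text);               // cost: hash the string
      auto it = std::lower_bound(names.begin(), names.end(), hash,  // cost: binary search 1
         [](const MockName& name, unsigned int h) { return name.hash < h; });
      if (it == names.end() || it->hash != hash)
         return MockName{ nullptr, 0 };
      return *it;
   }

   const MockType* GetType(unsigned int hash) const
   {
      auto it = std::lower_bound(types.begin(), types.end(), hash,  // cost: binary search 2
         [](const MockType& type, unsigned int h) { return type.name_hash < h; });
      if (it == types.end() || it->name_hash != hash)
         return nullptr;
      return &*it;
   }
};
```

The string literal passed to GetName is the third cost: it has to live in the executable for the call to work.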

The current solution eliminates all of these problems using runtime code self-modification, defining the following templated function:

// When you call this function, the compiler creates a single, shared implementation for the
// type. The address of this function is exported to a MAP file that clExport parses and
// associates with the type.
template <typename TYPE>

// Prevent inlining so that only one copy of the function exists
__declspec(noinline)

// To ensure the self-modification code can predictably parse the x86 opcodes, prevent the
// compiler from generating extra epilogue/prologue code.
__declspec(naked)

unsigned int GetTypeNameHash()
{
   // Each type gets its own instance of this variable. At compile-time it's initialised to
   // zero. At runtime, when the database is loaded, it is located and changed to the value
   // of the type name hash.
   static unsigned int hash = 0;

   // Set return value manually in ASM so that we can parse the x86 opcodes reliably, look
   // for the 'mov' instruction, get the address of the source operand and modify the
   // hash value.
   __asm
   {
      mov eax, dword ptr [hash]
      ret
   }
}

It is close to the ideal solution, allowing you to do:

const clcpp::Type* type = clcpp::GetType<MyTypeName>();

It's constant-time with no searches, no hashing and no requirement to store strings in the runtime executable. Its problems are:

  • It's highly platform and compiler specific, with GCC on x86 not supporting the 'naked' attribute. I really want to avoid as much platform-specific code in the core libraries as possible.
  • The validity and security implications of self-modifying code vary wildly between platforms, with some requiring obscure tricks such as using the GPU to bypass OS memory protection.
  • When using GetType, the database used for retrieval is bound to the module that loads the database. In a scenario of multiple modules (DLLs), this may cause issues.

Of course, the first question is how to port it. But I want to open the discussion up further: how can we derive an alternative that doesn't have as many problems?

How do we port this?

The first step in porting this is pulling it out of clReflectCpp and into clReflectUtil. The main problem with this is that clReflectExport knows that it should be looking for calls to GetType and GetTypeNameHash and embeds their addresses in a special table within the clReflect runtime database.


Question 1: Is there a simple way to achieve this?

One way I'm thinking about is allowing clReflect to reflect templated functions so that we can eliminate the custom table and remove custom association code from clReflectExport. -- dwilliamson


Question 2: Would it actually be simpler, for now, to do a straight port of what exists to GCC/x86/x64? Mac OS X would come later and other platforms: much later!

Instead of using naked functions, we could change the address of the GetType/GetTypeNameHash functions to runtime generated ones. This would get around the need to parse the opcodes and also plays a little nicer with OS memory protection schemes. -- dwilliamson


Based on a small experiment, we cannot simply patch a code page on Mac OS X (I didn't test on Linux, but I would expect similar behavior). We need to use mprotect to make the page writable, which requires including sys/mman.h. Based on this, I'm in favour of the runtime-generated functions here. FYI, here's a very nice talk on this topic: https://vimeo.com/14951625. Thanks to Joe Damato for his great work! -- xxuejie
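
For reference, the mprotect step might look roughly like this on a POSIX system. SetPageProtection is a hypothetical helper, not existing clReflect code, and this sketch has only been reasoned through for ordinary data pages; whether a given OS allows a code page to become writable is a separate question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: applies protection (for example PROT_READ | PROT_WRITE) to every
// page that overlaps [address, address + size). Returns true on success.
inline bool SetPageProtection(void* address, std::size_t size, int protection)
{
   assert(address != nullptr && size > 0);

   // mprotect requires a page-aligned start address, so round down to the page boundary
   const std::uintptr_t page_size = static_cast<std::uintptr_t>(sysconf(_SC_PAGESIZE));
   const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(address);
   const std::uintptr_t aligned_begin = begin & ~(page_size - 1);
   const std::size_t length = static_cast<std::size_t>(begin + size - aligned_begin);

   return mprotect(reinterpret_cast<void*>(aligned_begin), length, protection) == 0;
}
```

For a code page the protection would presumably also need PROT_EXEC, and some systems may refuse a mapping that is writable and executable at the same time.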


But there's still another problem: suppose the database has a type A, and users have this in their own code:

const A* type = clcpp::GetType<A>();

We will also need to patch the code here to call the newly generated functions! So either way we may need to patch the runtime code. -- xxuejie


It's only just occurred to me: this is not self-modifying code. There are two distinct steps:

  1. Parse executable opcodes.
  2. Modify the hash value.

For step 1, if the page is readable, this isn't a problem. For step 2, the hash value is a static and hence stored in the BSS section by most linkers (it's zero-initialised). This section is writable on most platforms.

As long as we can reliably get hold of the address by parsing the assembly, this should be portable. -- dwilliamson
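
Step 1 can be sketched as a small decoder over the function's bytes. This assumes 32-bit x86 only, where "mov eax, dword ptr [address]" is encoded either as A1 followed by the address or as 8B 05 followed by the address, and ret is C3. ExtractHashAddress is a hypothetical name, not the parser clReflect ships.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper for 32-bit x86. Accepts exactly:
//    A1 <address32> C3        mov eax, [address] (short form) ; ret
//    8B 05 <address32> C3     mov eax, [address] (ModRM form) ; ret
// On success the absolute address of the hash variable is written to address.
inline bool ExtractHashAddress(const unsigned char* code, std::size_t size,
                               std::uint32_t& address)
{
   assert(code != nullptr);

   std::size_t operand_offset = 0;
   if (size >= 6 && code[0] == 0xA1)
      operand_offset = 1;
   else if (size >= 7 && code[0] == 0x8B && code[1] == 0x05)
      operand_offset = 2;
   else
      return false;

   // Anything other than an immediate ret means the compiler emitted extra code
   if (code[operand_offset + 4] != 0xC3)
      return false;

   // x86 stores the operand little-endian; assemble it independently of the host order
   address = static_cast<std::uint32_t>(code[operand_offset])
           | static_cast<std::uint32_t>(code[operand_offset + 1]) << 8
           | static_cast<std::uint32_t>(code[operand_offset + 2]) << 16
           | static_cast<std::uint32_t>(code[operand_offset + 3]) << 24;
   return true;
}
```

A function that starts with a prologue such as push ebp (55) is rejected, which is the failure mode to expect from a compiler without a naked attribute. Step 2 is then an ordinary write of the hash value to the extracted address.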

Question 3: Can we disable this entirely for other platforms while we think of an air-tight porting strategy? Meanwhile, the slow fallback can be used for other platforms. -- dwilliamson


I also suggest we provide one new API here:

template <typename TYPE>
const clcpp::Type* GetType(const clcpp::Database& db, const char* type_name_string);

This is almost the same as the one in the C++ RTTI example. Note that by the time GetType is usable, we must already have loaded the corresponding database and patched the related functions. In other words, we already have a valid database instance, so users should find it easy to switch to this new API. Within this newly added API, we can write something like this:

template <typename TYPE>
const clcpp::Type* GetType(const clcpp::Database& db, const char* type_name_string)
{
#ifdef CLCPP_HAS_OPTIMIZED_GETTYPE
   return GetType<TYPE>();
#else
   // Maybe by parsing __FUNCSIG__ and __PRETTY_FUNCTION__, we can eliminate the need for type_name_string
   clcpp::Name type_name = db.GetName(type_name_string);
   return db.GetType(type_name.hash);
#endif
}

This way, we provide a unified API for all platforms and use the optimized version where possible. We can still provide the original no-argument version on supported platforms (MSVC, etc.). But developers who need platform independence should use this API instead. -- xxuejie


What are the alternatives?

C preprocessor macros

Introduce the following macros:

#define GetTypeNameHash(type_name) clcpp::internal::HashNameString(#type_name)
#define GetType(db, type_name) db.GetType(db.GetName(GetTypeNameHash(type_name)).hash)

This wraps the basic implementation with all its warts into a couple of pre-processor macros, using the stringizer operator to get the name of a type. This suffers all the drawbacks covered initially, as well as carrying the inability to be called from within templated functions.

Suggestion: Avoid. Not worth discussing any further.

C++ RTTI

We can extend the above to work with templates by using the type names returned by the C++ RTTI libraries:

// Required to use the object returned by typeid
#include <typeinfo>

template <typename TYPE>
unsigned int GetTypeNameHash()
{
   const char* name = typeid(TYPE).name();
   return clcpp::internal::HashNameString(name);
}

template <typename TYPE>
const clcpp::Type* GetType(const clcpp::Database& db)
{
   unsigned int type_name_hash = GetTypeNameHash<TYPE>();
   return db.GetType(type_name_hash);
}

On the surface, this is a pretty nice and simple way of wrapping GetType access. However, the initial performance limitations remain and there are a few other problems:

  • This depends on including the C++ typeinfo header and so far, I've managed to avoid including any external headers in the core libraries.
  • The name returned by typeid will likely vary between compilers, having various prefixes that don't match the correct type name.
  • Requires RTTI to be enabled. Although there will be no runtime cost to doing this, the executable may increase in size due to the added library overhead and typeinfo storage.

Assuming we ignore threading issues for now, performance can be improved pretty easily by using static local variables to cache results:

template <typename TYPE>
unsigned int GetTypeNameHash()
{
   // Only on the first call to this function is the hash calculated
   static unsigned int hash = 0;
   if (hash == 0)
   {
      const char* name = typeid(TYPE).name();
      hash = clcpp::internal::HashNameString(name);
   }

   return hash;
}

template <typename TYPE>
const clcpp::Type* GetType(const clcpp::Database& db)
{
   // Only on the first call to this function is the type looked up
   static const clcpp::Type* type = 0;
   if (type == 0)
   {
      unsigned int type_name_hash = GetTypeNameHash<TYPE>();
      type = db.GetType(type_name_hash);
   }

   return type;
}

The main performance hit will only be felt once per type and the strings will still need to be stored in memory, but it's a provably shippable solution.

At runtime, you can also strip the compiler-specific prefixes from the type names before submitting them for hash calculation:

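const char* name = typeid(TYPE).name();
// sizeof counts the null terminator, so each increment also skips the space after the
// prefix. strncmp (from <string.h>) stops at the terminator, so a short name such as
// "char" is never indexed past its end.
if (strncmp(name, "struct ", 7) == 0)
   name += sizeof("struct");
else if (strncmp(name, "class ", 6) == 0)
   name += sizeof("class");
else if (strncmp(name, "enum ", 5) == 0)
   name += sizeof("enum");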
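
The same stripping can be written so that it needs no library header and never reads past the end of a short name such as "char". A minimal sketch, where StartsWith and StripTypePrefix are hypothetical helpers rather than clReflect functions (the cassert include is only for the sanity check and could be dropped):

```cpp
#include <cassert>

// Hypothetical helper: true if text begins with prefix. Characters are compared one at a
// time and the loop exits on the first mismatch, including the terminator of a short text.
inline bool StartsWith(const char* text, const char* prefix)
{
   assert(text != nullptr && prefix != nullptr);
   while (*prefix != 0)
   {
      if (*text++ != *prefix++)
         return false;
   }
   return true;
}

// Hypothetical helper: skips a leading "struct ", "class " or "enum " if present.
// sizeof counts the null terminator, which conveniently accounts for the trailing space.
inline const char* StripTypePrefix(const char* name)
{
   if (StartsWith(name, "struct "))
      return name + sizeof("struct");
   if (StartsWith(name, "class "))
      return name + sizeof("class");
   if (StartsWith(name, "enum "))
      return name + sizeof("enum");
   return name;
}
```

The result points into the original string, so no allocation or copy is involved.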

This technique was actually used in an earlier version of clReflect, as seen in this change:

https://bitbucket.org/dwilliamson/clreflect/changeset/da718eb30b48#chg_inc/crcpp/crcpp.h_newline82

Suggestion: Most problems with this have been solved to a satisfactory degree, except the external dependency on the typeinfo header. Worth further discussion.

I have no idea of the cost of RTTI so I cannot tell, but I really wonder whether this is worth doing. -- xxuejie

Simulate C++ RTTI type name retrieval

To relieve the dependency on the typeinfo header, a function similar to this can be introduced:

template <typename TYPE>
const char* GetTypeName()
{
#ifdef _MSC_VER
   return __FUNCSIG__;
#else
   return __PRETTY_FUNCTION__;
#endif
}

This returns a string containing the type passed as the template parameter, which you can parse to extract the type name. The parsing will be more complicated than for the previous solution and varies much more between compilers. However, porting to other compilers would be pretty straightforward.

__FUNCSIG__ sample output:

const char *__cdecl GetTypeName<int>(void)
const char *__cdecl GetTypeName<struct mynamespace::mytype>(void)

__PRETTY_FUNCTION__ sample output:

const char* GetTypeName() [with TYPE = int]
const char* GetTypeName() [with TYPE = mynamespace::mytype]
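
Parsing those strings might look something like the sketch below. It is based only on the sample formats shown here plus the "[TYPE = ...]" form that Clang is assumed to produce; other compilers or versions may format things differently. ExtractTypeName is a hypothetical helper, and it uses std::string for brevity even though the core libraries avoid external headers.

```cpp
#include <cassert>
#include <string>

template <typename TYPE>
const char* GetTypeName()
{
#ifdef _MSC_VER
   return __FUNCSIG__;
#else
   return __PRETTY_FUNCTION__;
#endif
}

// Hypothetical helper: pulls the type name out of a signature string produced by
// GetTypeName. Returns an empty string if the format is not recognised.
inline std::string ExtractTypeName(const std::string& signature)
{
   // GCC: "... [with TYPE = int]", Clang (assumed): "... [TYPE = int]"
   const std::string gcc_key = "TYPE = ";
   std::string::size_type begin = signature.find(gcc_key);
   if (begin != std::string::npos)
   {
      begin += gcc_key.size();
      assert(begin <= signature.size());   // find() matched the whole key
      const std::string::size_type end = signature.rfind(']');
      if (end == std::string::npos || end < begin)
         return std::string();
      return signature.substr(begin, end - begin);
   }

   // MSVC: "... GetTypeName<struct mynamespace::mytype>(void)"
   const std::string msvc_key = "GetTypeName<";
   begin = signature.find(msvc_key);
   const std::string::size_type end = signature.rfind(">(");
   if (begin == std::string::npos || end == std::string::npos)
      return std::string();
   begin += msvc_key.size();
   if (end < begin)
      return std::string();
   const std::string name = signature.substr(begin, end - begin);

   // MSVC prefixes user-defined types with their kind
   const char* prefixes[] = { "struct ", "class ", "enum " };
   for (const char* prefix : prefixes)
   {
      const std::string p(prefix);
      if (name.compare(0, p.size(), p) == 0)
         return name.substr(p.size());
   }
   return name;
}
```

A caller would use it as ExtractTypeName(GetTypeName<MyTypeName>()) and pass the result on to the hash function.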

This actually looks interesting; we may want to look into it. We could also use it to simplify my proposed API above by removing the need for type_name_string. -- xxuejie

Use clang as a pre-processor

Currently you can run clReflectScan on any C++ file independently of your main compiler. There is no order of compilation or dependency enforced on the build system for this. I'm keen to keep it this way as build integration is much simpler.

One option if we want to bypass this is to make clReflectScan transform the C++ code which is then passed onto the main compiler. This will allow us to directly inject code, such as replacing all calls to a GetTypeNameHash function with a unique runtime ID. This runtime ID could then be used to do an array lookup within the database.
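
As a rough picture of what the injected code could reduce to, assume the scanner assigns each type a small integer ID that matches its index in a type array inside the database. MockType, MockDatabase and GetTypeByID are hypothetical names for illustration; nothing like them exists in clReflect today.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for clcpp::Type
struct MockType { const char* name; unsigned int size; };

// Hypothetical database whose type array is ordered by the IDs handed out at scan time
struct MockDatabase
{
   const MockType* types;
   std::size_t num_types;

   // Constant-time lookup: the ID is simply an index into the type array
   const MockType* GetTypeByID(std::size_t id) const
   {
      assert(types != nullptr);
      return id < num_types ? &types[id] : nullptr;
   }
};

// What the user writes:
//    const MockType* type = GetType<Vector3>(db);
// What a rewriting pre-pass could hand to the main compiler, if Vector3 was given ID 1:
//    const MockType* type = db.GetTypeByID(1);
```

The scheme only holds if the IDs baked into the transformed source and the order of the database's type array come from the same scan.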

An extra bonus is that we get to define a more concise attribute language. But that should not be allowed to influence this decision too much.
