The main point of `py::module_local` is to make the C++ -> Python cast
unique so that returning/casting a C++ instance is well-defined.
Unfortunately it also makes loading unique, but this isn't particularly
desirable: when an instance contains `Type` instance there's no reason
it shouldn't be possible to pass that instance to a bound function
taking a `Type` parameter, even if that function is in another module.
This commit solves the issue by allowing foreign module (and global)
type loaders have a chance to load the value if the local module loader
fails. The implementation here does this by storing a module-local
loading function in a capsule in the python type, which we can then call
if the local (and possibly global, if the local type is masking a global
type) version doesn't work.