[LIB] Harmony v1.2.0.1

Started by Brrainz, January 13, 2017, 04:59:21 PM


Brrainz

The problem was that the JIT compiler optimized the assembler and Harmony was writing over the end of the JIT buffer. I have used LINQPad to test my CIL code in real time and think that this will solve the problem. Somehow my local PC was not so picky about memory allocation and never triggered the problem.

Harmony + CameraPlus are updated on GitHub, but there is no release yet. CameraPlus is completely built, so a zip download should be sufficient for testing.

Brrainz

According to the guys on Discord, it seems not to work. I'm out of ideas, since it always works on any of my machines.  :-\

scuba156

I tested on Win10 and OSX and get the same. Here's an OS X stacktrace:

Receiving unhandled NULL exception
Obtained 38 stack frames.
#0  0x00000101e213a5 in mono_type_get_type
#1  0x00000101decadb in type_check_context_used
#2  0x00000101deca18 in inst_check_context_used
#3  0x00000101dec9c6 in mono_generic_context_check_used
#4  0x00000101d7d8d5 in mono_method_check_context_used
#5  0x00000101d7783a in mono_magic_trampoline
#6  0x0000010a79c171 in (Unknown)
#7  0x00000115e159bf in (Unknown)
#8  0x00000101d0a012 in mono_jit_runtime_invoke
#9  0x00000101e3442a in mono_runtime_invoke
#10 0x00000100bc2a51 in ScriptingInvocation::Invoke(MonoException**)
#11 0x00000100bc2f9b in ScriptingInvocationNoArgs::InvokeChecked()
#12 0x00000100bb0ece in MonoBehaviour::CallAwake()
#13 0x00000100bb1343 in MonoBehaviour::AddToManager()
#14 0x00000100bb0e05 in MonoBehaviour::AwakeFromLoad(AwakeFromLoadMode)
#15 0x00000100bf0cb9 in AwakeFromLoadQueue::InvokePersistentManagerAwake(AwakeFromLoadQueue::Item*, unsigned int, AwakeFromLoadMode)
#16 0x00000100bc16aa in LoadSceneOperation::CompleteAwakeSequence()
#17 0x00000100bc11fc in LoadSceneOperation::PlayerLoadSceneFromThread()
#18 0x00000100bc0e82 in LoadSceneOperation::IntegrateMainThread()
#19 0x00000100bbd8cb in PreloadManager::UpdatePreloadingSingleStep(PreloadManager::UpdatePreloadingFlags, int)
#20 0x00000100bbe1da in PreloadManager::UpdatePreloading()
#21 0x00000100b51e69 in PlayerLoop(bool, bool, IHookEvent*)
#22 0x0000010115e6a1 in -[PlayerAppDelegate UpdatePlayer]
#23 0x007fffab28ef7f in __NSFireTimer
#24 0x007fffa97e5294 in __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__
#25 0x007fffa97e4f23 in __CFRunLoopDoTimer
#26 0x007fffa97e4a7a in __CFRunLoopDoTimers
#27 0x007fffa97dc5d1 in __CFRunLoopRun
#28 0x007fffa97dbb54 in CFRunLoopRunSpecific
#29 0x007fffa8d66acc in RunCurrentEventLoopInMode
#30 0x007fffa8d66901 in ReceiveNextEventCommon
#31 0x007fffa8d66736 in _BlockUntilNextEventMatchingListInModeWithFilter
#32 0x007fffa730cae4 in _DPSNextEvent
#33 0x007fffa7a8721f in -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:]
#34 0x007fffa7301465 in -[NSApplication run]
#35 0x007fffa72cbd80 in NSApplicationMain
#36 0x0000010115cfde in PlayerMain(int, char const**)
#37 0x00000100002034 in start
Stacktrace:


Native stacktrace:

0   libsystem_kernel.dylib              0x00007fffbef07dd6 __pthread_kill + 10
1   libsystem_c.dylib                   0x00007fffbee6d420 abort + 129
2   RimWorldMac                         0x0000000100bb4ef4 _Z12HandleSignaliP9__siginfoPv + 36
3   libmono.0.dylib                     0x0000000101dbfece mono_chain_signal + 75
4   libmono.0.dylib                     0x0000000101d0838a mono_sigsegv_signal_handler + 210
5   libsystem_platform.dylib            0x00007fffbefe6bba _sigtramp + 26
6   ???                                 0x00000000000000b0 0x0 + 176
7   libmono.0.dylib                     0x0000000101decadb type_check_context_used + 22
8   libmono.0.dylib                     0x0000000101deca18 inst_check_context_used + 58
9   libmono.0.dylib                     0x0000000101dec9c6 mono_generic_context_check_used + 22
10  libmono.0.dylib                     0x0000000101d7d8d5 mono_method_check_context_used + 37
11  libmono.0.dylib                     0x0000000101d7783a mono_magic_trampoline + 795
12  ???                                 0x000000010a79c171 0x0 + 4470718833
13  ???                                 0x0000000115e159bf 0x0 + 4662057407
14  libmono.0.dylib                     0x0000000101d0a012 mono_jit_runtime_invoke + 1766
15  libmono.0.dylib                     0x0000000101e3442a mono_runtime_invoke + 117
16  RimWorldMac                         0x0000000100bc2a51 _ZN19ScriptingInvocation6InvokeEPP13MonoException + 49
17  RimWorldMac                         0x0000000100bc2f9b _ZN25ScriptingInvocationNoArgs13InvokeCheckedEv + 43
18  RimWorldMac                         0x0000000100bb0ece _ZN13MonoBehaviour9CallAwakeEv + 174
19  RimWorldMac                         0x0000000100bb1343 _ZN13MonoBehaviour12AddToManagerEv + 147
20  RimWorldMac                         0x0000000100bb0e05 _ZN13MonoBehaviour13AwakeFromLoadE17AwakeFromLoadMode + 549
21  RimWorldMac                         0x0000000100bf0cb9 _ZN18AwakeFromLoadQueue28InvokePersistentManagerAwakeEPNS_4ItemEj17AwakeFromLoadMode + 217
22  RimWorldMac                         0x0000000100bc16aa _ZN18LoadSceneOperation21CompleteAwakeSequenceEv + 202
23  RimWorldMac                         0x0000000100bc11fc _ZN18LoadSceneOperation25PlayerLoadSceneFromThreadEv + 636
24  RimWorldMac                         0x0000000100bc0e82 _ZN18LoadSceneOperation19IntegrateMainThreadEv + 114
25  RimWorldMac                         0x0000000100bbd8cb _ZN14PreloadManager26UpdatePreloadingSingleStepENS_21UpdatePreloadingFlagsEi + 363
26  RimWorldMac                         0x0000000100bbe1da _ZN14PreloadManager16UpdatePreloadingEv + 218
27  RimWorldMac                         0x0000000100b51e69 _Z10PlayerLoopbbP10IHookEvent + 921
28  RimWorldMac                         0x000000010115e6a1 -[PlayerAppDelegate UpdatePlayer] + 321
29  Foundation                          0x00007fffab28ef7f __NSFireTimer + 83
30  CoreFoundation                      0x00007fffa97e5294 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 20
31  CoreFoundation                      0x00007fffa97e4f23 __CFRunLoopDoTimer + 1075
32  CoreFoundation                      0x00007fffa97e4a7a __CFRunLoopDoTimers + 298
33  CoreFoundation                      0x00007fffa97dc5d1 __CFRunLoopRun + 2081
34  CoreFoundation                      0x00007fffa97dbb54 CFRunLoopRunSpecific + 420
35  HIToolbox                           0x00007fffa8d66acc RunCurrentEventLoopInMode + 240
36  HIToolbox                           0x00007fffa8d66901 ReceiveNextEventCommon + 432
37  HIToolbox                           0x00007fffa8d66736 _BlockUntilNextEventMatchingListInModeWithFilter + 71
38  AppKit                              0x00007fffa730cae4 _DPSNextEvent + 1120
39  AppKit                              0x00007fffa7a8721f -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 2789
40  AppKit                              0x00007fffa7301465 -[NSApplication run] + 926
41  AppKit                              0x00007fffa72cbd80 NSApplicationMain + 1237
42  RimWorldMac                         0x000000010115cfde _Z10PlayerMainiPPKc + 638
43  RimWorldMac                         0x0000000100002034 start + 52

Brrainz

I am starting to wonder if it is somehow build/project related. Would anybody here try this: using the latest version of both projects on GitHub, just copy all .cs files from both into a new project, just like you would for your own mods, and run it? The only dependencies are the usual two RimWorld DLLs.

Brrainz

Just for reference:

I removed the Mono installation from my PC, started VS 2015 Enterprise, created a new project (.NET 3.5), went to GitHub and did a "Clone or download -> Download ZIP" there, copied all the downloaded Harmony .cs files, copied the Main.cs from CameraPlus, set the unsafe flag in the project settings and built a Debug build. Then I put it into my RW, enabled the new mod (only Core otherwise), did not even quit RW, went into an existing game, and it zooms perfectly. No crash. That's all I did, nothing more.

My PC: Intel i7-3770, 16GB 64-bit x64-processor, freshly restarted.

So where is the fucking problem?

scuba156

I'll do some test builds in about 10 mins. I've been using VS2017 community, but I'll give 2015 a go without mono and see if it helps.

PC specs: Win 10 x64, i7 6700k x64, 16Gb

Brrainz

Quote from: scuba156 on January 17, 2017, 03:09:01 AM
I'll do some test builds in about 10 mins. I've been using VS2017 community, but I'll give 2015 a go without mono and see if it helps.

PC specs: Win 10 x64, i7 6700k x64, 16Gb
At this point, I am quite sure it is somehow the logic of the code that inserts the first jump and then the second jump. It may have gotten f*cked up after refactoring and now contains a bug that only triggers on architectures different from mine. It's not that the code is huge or complicated, but I have had no time to double-check. A friend followed the steps above and it still crashes, so I think it's pointless for you to just replicate them.

scuba156

Quote from: pardeike on January 17, 2017, 03:29:11 AM
At this point, I am quite sure it is somehow the logic of the code that inserts the first jump and then the second jump. It may have gotten f*cked up after refactoring and now contains a bug that only triggers on architectures different from mine. It's not that the code is huge or complicated, but I have had no time to double-check. A friend followed the steps above and it still crashes, so I think it's pointless for you to just replicate them.
Can confirm that it still crashes.

I have no experience with assembly and memory allocation, so I don't believe I would be of much use beyond testing.

skullywag

Brrainz, Fluffy and I debugged this last night on the modders' Discord, and Brrainz figured out the issue with the crashes. Expect an update from him soon. He was obviously very excited, but it was late when we figured it out.

Brrainz

Hej guys,

I spent some time documenting the crash in Harmony. What I did was test a debug version of Harmony (checked in) with two mods of mine: CameraPlus and SameSpot. I used offline RW with the debug version of mono.dll to break into VS, but I think the crash happens even in the Steam version.

I have documented the process as closely as possible, with memory debug info from Harmony from before and after, and all the memory locations. Here is a gist for the report:

https://gist.github.com/pardeike/ec9ffb349379390b46d94f787bbdeb90

RawCode, skullywag, anybody with some intimate knowledge of why this happens with DynamicMethod, could you please have a look and clarify things? This is no longer related to the jumps or memory allocation or such, but to DynamicMethod "changing". I do pin all managed objects to a global static array, so it's hard to believe that GC is the culprit here.

As a short refresher: Harmony adds a jump from the original method to an intermediate memory location, and from there a jump to the wrapper (a DynamicMethod) that is custom-built to match the signature of the original method. That wrapper calls all patches and, of course, the copy of the original (also a DynamicMethod). It is that wrapper that is changing.
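
For readers who have not seen what "writing a jump" looks like, here is a minimal sketch in C. It assumes x86-64 and the common 12-byte `mov rax, imm64; jmp rax` sequence. The name `write_jump` and the capacity check are made up for illustration and are not taken from Harmony's source. The check is there because writing past the end of a code buffer is exactly the kind of mistake discussed earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the sequence: 2 bytes opcode + 8 bytes address + 2 bytes jmp. */
enum { JUMP_SIZE = 12 };

/* Hypothetical helper: encodes "mov rax, target; jmp rax" into buf.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t write_jump(uint8_t *buf, size_t capacity, uint64_t target) {
    assert(buf != NULL);
    if (capacity < JUMP_SIZE)
        return 0;                       /* refuse to write past the end */

    buf[0] = 0x48;                      /* REX.W prefix */
    buf[1] = 0xB8;                      /* mov rax, imm64 */
    for (int i = 0; i < 8; i++)         /* address, little-endian */
        buf[2 + i] = (uint8_t)(target >> (8 * i));
    buf[10] = 0xFF;                     /* jmp rax */
    buf[11] = 0xE0;
    return JUMP_SIZE;
}
```

Chaining two of these (original to intermediate memory, intermediate memory to wrapper) gives the layout described above. The sketch only fills a byte buffer and does not touch executable memory.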

I think this is the last bug in Harmony.
/Andreas

RawCode

Each method's life has 3 stages:

metadata record, no code at all; - before "compile_method" is called, which is what GetFunctionPointer() calls from managed code;
code filled with trampolines; - before the method has been called at least once
actual code with actual references; - after the method has actually completed at least once

Dynamic methods are not special in any way: when a method is compiled, it is still filled with trampolines, and calling a trampoline causes the runtime to update the call site.
As far as I remember, the runtime always emits trampolines, even if it has a compiled version of the method in cache.
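
A toy model of that trampoline behaviour, written in plain C. This is not Mono's implementation, and all names (`call_site`, `trampoline`, `compiled_body`) are invented for the sketch. It only shows the idea that the first call goes through a stub, which then patches the call site so later calls go straight to the real code.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(int);

int trampoline(int x);                  /* forward declaration */

/* Hypothetical call site: starts out pointing at the trampoline. */
method_fn call_site = trampoline;
int compile_count = 0;

/* Stand-in for the JIT-compiled method body. */
int compiled_body(int x) {
    return x * 2;
}

/* The first call lands here: "compile", patch the call site,
 * then forward the call to the real code. */
int trampoline(int x) {
    assert(call_site == trampoline);    /* only reachable before patching */
    compile_count++;
    call_site = compiled_body;          /* the call site changes here */
    return call_site(x);
}
```

After one call through `call_site`, it no longer points at `trampoline`. Anyone who recorded the old pointer, or the bytes around it, before that first call is looking at something that has since changed.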

RuntimeHelpers.PrepareMethod(targetMethod.MethodHandle);
and all the code changes related to this method: well, I have bad news for you.
[MonoTODO("Currently a no-op")]
public static void PrepareMethod (RuntimeMethodHandle method)
{
}

[MonoTODO("Currently a no-op")]
public static void PrepareMethod (RuntimeMethodHandle method, RuntimeTypeHandle[] instantiation)
{
}

Fetching the x86 code for this method from the game will also give you a no-op function.
All code related to the usage of PrepareMethod is invalid for the Mono runtime.

Code for functions is allocated from a native pool, not the heap. In addition, it is permanent and never deallocated: even if the dynamic method is lost, its data will stay as long as the domain hosting the data is alive.

To detect why and how, just "protect" the dynamic method's memory from writing with VirtualProtect, and you will see a trace with the method name\chain that caused the modification.
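
A minimal sketch of the write-protect trick. VirtualProtect is the Win32 call. This sketch swaps in the POSIX equivalent, `mprotect`, so it runs on Linux or OS X, and the page it protects is an ordinary anonymous mapping standing in for a method's code. The function name `write_is_trapped` is invented for the example.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_jump;

/* Fault handler: jump back to the point just before the offending write. */
static void on_fault(int sig) {
    (void)sig;
    siglongjmp(fault_jump, 1);
}

/* Returns 1 if a write to the read-only page was trapped,
 * 0 if the write went through, -1 if the setup failed. */
int write_is_trapped(void) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* Stand-in for the memory that holds a method's machine code. */
    unsigned char *code = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (code == MAP_FAILED)
        return -1;
    code[0] = 0xC3;                          /* pretend "ret" */

    if (mprotect(code, (size_t)page, PROT_READ) != 0) {
        munmap(code, (size_t)page);
        return -1;
    }

    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);        /* some systems raise SIGBUS */

    volatile int trapped = 0;
    if (sigsetjmp(fault_jump, 1) == 0) {
        ((volatile unsigned char *)code)[0] = 0x90;   /* attempted overwrite */
    } else {
        trapped = 1;                         /* the write faulted */
    }

    sigaction(SIGSEGV, &old_segv, NULL);     /* restore previous handlers */
    sigaction(SIGBUS, &old_bus, NULL);
    assert(!trapped || code[0] == 0xC3);     /* a trapped write changed nothing */
    munmap(code, (size_t)page);
    return trapped;
}
```

In a real debugging session one would leave the handler out and let the fault stop the process under a debugger, since the stack at that moment is the trace that names the writer. The handler is here only so the sketch can report the result and return.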

Brrainz

Quote from: RawCode on January 19, 2017, 06:52:26 AM
To detect why and how, just "protect" dynamic method memory from writing with virtual protect, and you will see trace with method name\chain that caused modification.
That is a good idea, RawCode. One question I have though:

Assuming that the old way of Detour works by simply writing a jump from original method to replacement method, why would the Harmony code not work?

It only writes jumps too (the extra step in between can't be the problem).

The difference is that the old Detour uses replacement methods that are from the loaded Assembly of the mod and Harmony uses a DynamicMethod that generates the method instead.

You said that DynamicMethods are not special. But that means that the jump that triggers all the things you explained in your post would do the same with a DynamicMethod. I assume that the assembler entry point of a method (dynamic or not) is always what is returned from GetFunctionPointer() and will never change.

But redirecting the assembler flow from original to replacement is different for assembly loaded methods and dynamic methods. Which means that they are not treated the same. Do you follow my logic here or is there something I am missing?

Brrainz

I also found these two articles:
https://www.codeproject.com/Articles/37549/CLR-Injection-Runtime-Method-Replacer
http://www.ntcore.com/Files/netint_injection.htm

EDIT: I am at work so no way to test this, but I found this on stack overflow: http://stackoverflow.com/questions/13655402/dynamicmethod-prelink :

Marshal.Prelink(MethodInfo);

which the author claims only works in mono (yay!) - could this be the solution?

RawCode

No MS runtime code injection stuff works on Mono, and vice versa.

ICALL_EXPORT void
ves_icall_System_Runtime_InteropServices_Marshal_Prelink (MonoReflectionMethod *method)
{
	MONO_ARCH_SAVE_REGS;
	prelink_method (method->method);
}


static void
prelink_method (MonoMethod *method)
{
	const char *exc_class, *exc_arg;
	if (!(method->flags & METHOD_ATTRIBUTE_PINVOKE_IMPL))
		return;
	mono_lookup_pinvoke_call (method, &exc_class, &exc_arg);
	if (exc_class) {
		mono_raise_exception(
			mono_exception_from_name_msg (mono_defaults.corlib, "System", exc_class, exc_arg ) );
	}
	/* create the wrapper, too? */
}


Prelink works only for platform invoke (native\extern) methods and does nothing for managed methods.

Probably your problem is related to the usage of MS-based code, which does not work.

I still haven't even peeked at the Harmony code. Probably there is some obvious error related to some obvious thing, like not calling a native constructor or calling a native constructor with invalid arguments.




Brrainz

I know you are sceptical, but I have coded professionally for about 28 years and I triple-checked the code. There might be some edge cases with strange argument types and classes, but that usually results in a simple illegal opcode error. Besides the jumps and the code that executes only if you patch a method again (not in the scope of this test and not executed), there is nothing assembler-related in Harmony. The debug info would also show any abnormal use of the assembler routines or the statics being reused because of the library being cached in the assembly cache. I think one could even comment out the extra code that handles regenerating the IL code and storing information for that. Which would mean that one could use the ordinary Detour as used by e.g. HugsLib.

I might try that tomorrow to prove that there is no funky stuff involved. My way forward now is to read and understand Mono, both the .cs and the .c sources. StarWars debugging: "Use the source, Luke!"