New code injection method

Started by Micktu, August 05, 2016, 02:59:26 AM

Micktu

I made a proof-of-concept method call patcher: https://github.com/micktu/RimWorld-BuildProductive/blob/master/Source/MethodCallPatcher.cs

It looks for every call to a specific method inside another method's body and replaces it with a call to a method of your choice.

As far as I know, there is no way either to force Mono JIT compilation or to check whether a method has been compiled, so the patcher repeatedly scans the method over time until it finds what it needs. Since it only loops through a few hundred bytes per tick with direct memory access, it's not very taxing. Funny thing is, the method body does exist before the actual method is compiled, but I'm not sure what it contains.
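The scan-and-replace step can be sketched roughly as follows. This is not the code from MethodCallPatcher.cs; it is a minimal illustration that works on a managed copy of the bytes and assumes the call sites use the direct x86 `CALL rel32` encoding (opcode 0xE8) on a little-endian host. The class name `CallPatchSketch`, the method `PatchCalls`, and all addresses are hypothetical.

```csharp
using System;

public static class CallPatchSketch
{
    // Scans 'code' (a copy of native bytes that would live at address 'baseAddr')
    // for CALL rel32 instructions (opcode 0xE8) whose destination is 'oldTarget'
    // and rewrites them to call 'newTarget'. Returns the number of patched calls.
    // Assumption: the scan is byte-wise, so a 0xE8 byte inside another
    // instruction's operand could be misread; requiring an exact target match
    // makes that unlikely, but not impossible.
    public static int PatchCalls(byte[] code, long baseAddr, long oldTarget, long newTarget)
    {
        int patched = 0;
        for (int i = 0; i + 5 <= code.Length; i++)
        {
            if (code[i] != 0xE8) continue;

            int rel = BitConverter.ToInt32(code, i + 1);
            long next = baseAddr + i + 5; // address of the instruction after the CALL
            if (next + rel != oldTarget) continue;

            long newRel = newTarget - next;
            if (newRel < int.MinValue || newRel > int.MaxValue)
                throw new InvalidOperationException("New target is out of rel32 range.");

            Buffer.BlockCopy(BitConverter.GetBytes((int)newRel), 0, code, i + 1, 4);
            patched++;
            i += 4; // skip the operand that was just rewritten
        }
        return patched;
    }
}
```

A real patcher would do the same arithmetic directly on executable memory; the byte array here only stands in for it so the relative-offset math is easy to follow.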

This enables more kinds of injections into methods; I'll provide an example of a better detour method than CCL has currently.

  • Read the first few instructions from the source method and save them*;
  • Place a JMP to the target method at the start of the source method;
  • Allocate some executable memory and place the preserved instructions there, followed by a JMP back into the source method, to the instruction right after the JMP to target;
  • Patch the call to the source method in the target method's body with a call to our allocated buffer.
* This assumes the first instructions are consistent across methods; more research needed.

Et voila, we have a detour that can actually call the base method.
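The byte layout behind those steps can be sketched like this. It is only an illustration on plain byte arrays, under two assumptions that the post itself flags as unverified: the saved length ends on an instruction boundary, and the saved instructions contain no position-dependent operands. The names `TrampolineSketch`, `EncodeJmp`, `BuildTrampoline` and `WriteDetour` are hypothetical, and a little-endian x86 host is assumed.

```csharp
using System;

public static class TrampolineSketch
{
    const byte JmpRel32 = 0xE9;
    const int JmpSize = 5;

    // Encodes a "JMP rel32" that sits at address 'from' and lands on 'to'.
    public static byte[] EncodeJmp(long from, long to)
    {
        long rel = to - (from + JmpSize);
        if (rel < int.MinValue || rel > int.MaxValue)
            throw new InvalidOperationException("Target is out of rel32 range.");
        byte[] jmp = new byte[JmpSize];
        jmp[0] = JmpRel32;
        Buffer.BlockCopy(BitConverter.GetBytes((int)rel), 0, jmp, 1, 4);
        return jmp;
    }

    // Trampoline contents: the saved prologue bytes, then a JMP back into the
    // source method just past the bytes that will be overwritten.
    public static byte[] BuildTrampoline(byte[] source, int savedLength,
                                         long sourceAddr, long trampolineAddr)
    {
        if (savedLength < JmpSize)
            throw new ArgumentException("Need at least 5 bytes to hold the JMP.");
        byte[] tramp = new byte[savedLength + JmpSize];
        Buffer.BlockCopy(source, 0, tramp, 0, savedLength);
        byte[] back = EncodeJmp(trampolineAddr + savedLength, sourceAddr + savedLength);
        Buffer.BlockCopy(back, 0, tramp, savedLength, JmpSize);
        return tramp;
    }

    // Overwrites the start of the source with a JMP to the target method and
    // pads any leftover saved bytes with NOPs (0x90).
    public static void WriteDetour(byte[] source, int savedLength,
                                   long sourceAddr, long targetAddr)
    {
        byte[] jmp = EncodeJmp(sourceAddr, targetAddr);
        Buffer.BlockCopy(jmp, 0, source, 0, JmpSize);
        for (int i = JmpSize; i < savedLength; i++) source[i] = 0x90;
    }
}
```

The trampoline has to be built before the detour is written, because writing the JMP destroys the bytes the trampoline needs to preserve.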

Fluffy (l2032)

This looks very promising. Am I correct in assuming that this would allow one to attach events* to method calls? E.g. leave the original call intact, but do something extra when it is called, with access to the original arguments and instance data?

(I'm aware it's not really an event, but you get the gist).

Micktu

#2
Yeah, pretty much that, you can make hooks/callbacks. That's exactly what I had in mind.

I didn't post an actual usage sample, but in my case, I inject a command before gizmo rendering. Wanted to use it to inject commands into existing Thing classes.

       
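        // Replacement method: the patched call site lands here instead of in the original.
        internal static void GizmoGridDrawer_DrawGizmoGrid(IEnumerable<Gizmo> gizmos, float startX, out Gizmo mouseoverGizmo)
        {
            // Build the extra command to show next to the existing gizmos.
            Command_Action command_Action = new Command_Action();
            command_Action.defaultLabel = "Test Icon";
            command_Action.action = delegate
            {
                // Empty placeholder action.
            };

            // Copy into a new list instead of casting, so this also works when
            // the caller passes something other than a List<Gizmo>.
            List<Gizmo> gizmoList = new List<Gizmo>(gizmos);
            gizmoList.Add(command_Action);

            // Forward to the original method, which also assigns mouseoverGizmo.
            GizmoGridDrawer.DrawGizmoGrid(gizmoList, startX, out mouseoverGizmo);
        }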

1000101

#3
Quote from: Micktu on August 05, 2016, 02:59:26 AM
...a better detour method than CCL has currently.

That's a rather opinionated and biased statement.  That it's constantly scanning makes it no "better" in my opinion, just different.

The only problem I see with this is that you need to scan every single method to trap all cases, as opposed to CCL's current implementation, which traps all cases with one change.

Anyway, it sounds promising for certain cases and I look forward to seeing some actual working code using this.
(2*b)||!(2*b) - That is the question.
There are 10 kinds of people in this world - those that understand binary and those that don't.

Micktu

#4
The current method does not allow returning to the original method. For me it's essential to be able to call the original, i.e. to produce an actual hook that does not require copy-pasting decompiled code into the target method, which is the case in CCL right now. Such copied code will break constantly and will require a meticulous update on every application update. To sum it up, it doesn't really work.

The snippet above is a sample of actual tested-to-be-working code I'll be using to inject commands into things.

1000101

CCL's detouring method does not allow calls to base.method, correct.  Doing so will cause a stack overflow due to the call being resolved back into the detour, correct.  The solution, which isn't great, is to include the entire method bodies of all base.method calls in the detour, correct.

However, scanning every method in every DLL plus the game executable 10 times per second to patch call sites, in order to trap all calls to your detour, is also not the best method and will certainly impose a large performance penalty.

I'm not saying I dislike your idea, just that it's sub-optimal for blanket cases.  For the "I only want to redirect this one method's calls" situation, it's very good.

Micktu

#6
>scanning every method in every DLL plus the game executable

No, it was never meant to work this way; it is intended to patch calls in a single method. Above I described how to use it, with some additions to the existing CCL method, to make an actual working hook. Basically, you're not patching the game's method; you're patching your own, so you don't get stuck in an infinite loop when you call the base one, and you return properly.

Anyway, no, there's no real performance impact, the scan might take a few microseconds per frame and that's all. And it stops when it has been patched. I benchmarked a single scan, it takes 1.5-3 microseconds on my PC.

For example, if all required scans for all mods took 50us, that's still only about 0.3% of a tick, which is completely negligible in my opinion. And I'm not even doing it every tick.
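The 0.3% figure works out if a tick lasts roughly 16.7 ms, i.e. 60 ticks per second. That rate is an assumption inferred from the numbers in the post, not something stated in it. A small sketch of the arithmetic, with the hypothetical helper `PercentOfTick`:

```csharp
public static class TickBudgetSketch
{
    // Percentage of one tick consumed by 'scanMicroseconds' of work,
    // assuming 'ticksPerSecond' ticks per real-time second.
    public static double PercentOfTick(double scanMicroseconds, double ticksPerSecond)
    {
        double tickMicroseconds = 1000000.0 / ticksPerSecond;
        return scanMicroseconds / tickMicroseconds * 100.0;
    }
}
```

With 50 microseconds of scanning at 60 ticks per second, `PercentOfTick(50, 60)` comes out at about 0.3.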

1000101

Quote from: 1000101 on August 05, 2016, 04:46:57 AM
I'm not saying I dislike your idea, just that it's sub-optimal for blanket cases.  For the "I only want to redirect this one method's calls" situation, it's very good.

Key statements, important.

Micktu

Maybe I'm bad at explaining things. It's not the complete solution, it's a part of a "blanket" solution that can replace the current hook method.

Fluffy (l2032)

So, let me phrase this in simple(r) terms to see if I understand it.

Let's say we have two methods, Vanilla.Foo( object x ), and MyMod.Bar( object x ). I want to call Bar() whenever Foo() is called.

I'd call your code to create a patch for me, which starts the following process in your code:
- Every x ticks, Foo() is scanned to see if it has been JIT-compiled yet.
- If Foo() was compiled, we store the first few instructions in Foo(), then overwrite them with a jump to Bar().
- We also create a 'new' method, Boo(), that consists of Foo's stored instructions, and a jump back into Foo() ['behind' the jump to Bar(), so we don't end up in an infinite loop].
- Finally, any call to Foo() from Bar() is amended to instead call Boo() [since calling Foo() would cause an infinite loop].

That sounds pretty cool. I have a few (noob) questions though:
- How sure are you that the first few instructions in Foo() won't be affected by relative memory addresses?
- Are you checking the length of Foo() when patching?
- And finally, assuming Foo() can be reliably preserved, would you mind implementing a rudimentary event based system around this? I'm envisioning OnCall( MethodInfo( Foo), delegate( pars ) ) kind of schemes, as well as PreCall and PostCall hooks to intercept and change arguments and return values on their way to and from Foo().
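To make the PreCall/PostCall idea concrete, here is a rough managed-only sketch of what such a registry could look like. Nothing here exists in CCL or in Micktu's code; `HookRegistrySketch`, its delegates and `Invoke` are hypothetical names, and the `original` callback stands in for whatever mechanism (such as a trampoline) ends up reaching the unpatched method.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class HookRegistrySketch
{
    // Pre-hooks may modify 'args' in place before the original runs.
    public delegate void PreCall(object instance, object[] args);
    // Post-hooks receive the current result and return the (possibly changed) result.
    public delegate object PostCall(object instance, object[] args, object result);

    static readonly Dictionary<MethodBase, List<PreCall>> pre =
        new Dictionary<MethodBase, List<PreCall>>();
    static readonly Dictionary<MethodBase, List<PostCall>> post =
        new Dictionary<MethodBase, List<PostCall>>();

    static void Add<T>(Dictionary<MethodBase, List<T>> map, MethodBase key, T hook)
    {
        List<T> list;
        if (!map.TryGetValue(key, out list))
        {
            list = new List<T>();
            map[key] = list;
        }
        list.Add(hook);
    }

    public static void AddPreCall(MethodBase method, PreCall hook) { Add(pre, method, hook); }
    public static void AddPostCall(MethodBase method, PostCall hook) { Add(post, method, hook); }

    // Meant to be called from a detour body: runs the pre-hooks, then the
    // original method through 'original', then the post-hooks in order.
    public static object Invoke(MethodBase method, object instance, object[] args,
                                Func<object[], object> original)
    {
        List<PreCall> pres;
        if (pre.TryGetValue(method, out pres))
            foreach (PreCall hook in pres) hook(instance, args);

        object result = original(args);

        List<PostCall> posts;
        if (post.TryGetValue(method, out posts))
            foreach (PostCall hook in posts) result = hook(instance, args, result);

        return result;
    }
}
```

A shared registry along these lines would also address the concern about several mods hooking the same method independently, since each mod would only add a delegate instead of writing its own jump.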

Assuming E and the other CCL people are on board and it all works reliably cross-platform, I think it'd be a very good idea to get this in CCL. Having multiple mods implement these techniques independently, and possibly targeting the same methods is going to be quite the mess.

Also, regarding waiting for JIT compilation. If I understand the process correctly, the main difference between your approach and CCL's is that you salvage the detoured method (which is awesome if it works). What I don't understand is why CCL doesn't have to wait for JIT, and you apparently do. I mean, presumably if the method wasn't compiled and CCL injects the detour jump, compilation would override that. So either we never run into not-yet-compiled code, or there's something else going on (probably me misunderstanding something).

Micktu

Your explanation got a couple of things backwards, but it's not a big deal.

>How sure are you that the first few instructions in Foo() won't be affected by relative memory addresses?
Because the stack will be in the same state as it's supposed to be when we will get there.

>Are you checking the length of Foo() when patching?
No, I just scan till its return.

>would you mind implementing a rudimentary event based system around this
I'm not sure if I will need it, but anyone can do it, it's not complicated.

>What I don't understand is why CCL doesn't have to wait for JIT, and you apparently do.
I don't know either. Perhaps it's not compilation I'm waiting for, but optimization, so that the method gets static addresses I can patch; I don't really know much about how the JIT is supposed to work here. I'm investigating this at the moment.





1000101

Getting the MethodInfo and subsequently the method pointer is supposed to trigger the JIT itself, which makes waiting for it a bit redundant.  However, if for some reason this doesn't happen, then the injector needs to wait for it.  In my experience, though, I have never seen this happen inside Mono (which is what RimWorld is using).
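In code, "getting the MethodInfo and then the method pointer" looks roughly like the sketch below. `JitProbeSketch`, `Foo` and `GetEntryPoint` are hypothetical names; `RuntimeMethodHandle.GetFunctionPointer()` is the standard reflection call. Whether the returned address already points at fully compiled code, or only at a stub, depends on the runtime, so this should be read as an illustration of the call sequence and not as proof that the JIT has run.

```csharp
using System;
using System.Reflection;

public static class JitProbeSketch
{
    // Hypothetical method standing in for the one to be detoured.
    public static int Foo(int x) { return x + 1; }

    // Asks the runtime for a native entry point for 'method'.
    // Assumption (as described in the thread): on the Mono runtime in question
    // this causes the method to be compiled. Other runtimes may hand back a
    // stub that compiles the method lazily on first call.
    public static IntPtr GetEntryPoint(MethodBase method)
    {
        return method.MethodHandle.GetFunctionPointer();
    }
}
```

A caller would use it as `JitProbeSketch.GetEntryPoint(typeof(JitProbeSketch).GetMethod("Foo"))` and treat the result as the address to inspect or patch.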

As to scanning until the return, what happens if it has branches which jump over the return or has multiple returns?  Wouldn't getting the method length from the method info be a more reliable way to do this?
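One concrete way a scan-until-return can go wrong, besides branches and multiple returns, is that the x86 RET opcode 0xC3 can also appear as operand data. The sketch below is a hypothetical illustration, not the scan from MethodCallPatcher.cs: `MovThenRet` encodes `mov eax, 0xC3` followed by `ret`, and a naive byte-wise search stops inside the `mov`.

```csharp
public static class ReturnScanSketch
{
    // B8 C3 00 00 00 = mov eax, 0x000000C3 ; C3 = ret
    public static readonly byte[] MovThenRet = { 0xB8, 0xC3, 0x00, 0x00, 0x00, 0xC3 };

    // Index of the actual RET instruction in MovThenRet.
    public const int RealRetIndex = 5;

    // Naive approach: index of the first 0xC3 byte, or -1 if there is none.
    // It does not decode instructions, so it cannot tell an opcode from an operand.
    public static int FirstRetByte(byte[] code)
    {
        for (int i = 0; i < code.Length; i++)
            if (code[i] == 0xC3) return i;
        return -1;
    }
}
```

Here `FirstRetByte(MovThenRet)` yields 1 while the real return sits at index 5, so a reliable scan would need either a known method length or at least an instruction-length decoder.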

Micktu

>Getting the methodinfo and subsequent method pointer is supposed to actually trigger the JIT
Yeah, I was aware, that's why I was wondering. Turned out to be optimization: the calls are not routed as static until it's executed once.

> if it has branches which jump over the return or has multiple returns?
No, it doesn't. It's just that I haven't seen anything like this in native JIT-compiled code. Have you?

>Wouldn't getting the method length from the method info be a more reliable way to do this?
Does it provide native function size? I didn't think it was possible. Can you point me in the right direction?

Anyway, I'm trying to drop the waiting method and just make it a reliable hook. Trying out a few ways currently.

1000101

Quote from: Micktu
>Getting the methodinfo and subsequent method pointer is supposed to actually trigger the JIT
Yeah, I was aware, that's why I was wondering. Turned out to be optimization: the calls are not routed as static until it's executed once.

Under normal .NET that is true, but I've found that Mono seems to compile the method when getting the MethodInfo.  I can't say this is true under all circumstances, but the current detouring works under this assumption and that's the key thing: it works.

Quote from: Micktu
> if it has branches which jump over the return or has multiple returns?
No, it doesn't. It's just that I haven't seen anything like this in native JIT-compiled code. Have you?

>Wouldn't getting the method length from the method info be a more reliable way to do this?
Does it provide native function size? I didn't think it was possible. Can you point me in the right direction?

I could have sworn there was a field for the length of the method, but I could be wrong.  I just checked MSDN and no such field is listed, so I might be getting my wires crossed.

Quote from: Micktu
Anyway, I'm trying to drop the waiting method and just make it a reliable hook. Trying out a few ways currently.

CCL will still need its current detouring method, as it requires a guaranteed detour during its module initializer.  Those particular detours don't rely on calling any base methods, so they don't need to worry about the originals being destroyed.  However, a "cleaner" detouring method for "regular" usage, one which preserves the original or at least keeps it accessible, would be nice.

biship

I don't have 1/10th the coding knowledge you guys have... so I'm posting this here as it might be semi-relevant.
For most moddable games, someone eventually comes out with a performance meter of some kind. Is there one for RimWorld?
I'm looking for, or willing to learn how to make, a DLL that monitors how frequently (to start with) mod methods fire.
To paint a picture: a mod is supposed to fire when a pawn's mood changes, yet it fires needlessly on every pawn item interaction.
The extension of this would be to determine how much time each mod activation consumes.

From the code changes I follow on GitHub, I know you guys are aware of the need to optimize your own code. I was just wondering if there is a way to determine the impact of other people's code in-game. Thanks for any replies.