Interacting with the Original Code

Page history last edited by James Koppel 4 years, 6 months ago

Much of this can be learned from the video tutorial, Modding with an Iron Fist: Episode 4: Modding the Code ( http://www.youtube.com/watch?v=dFTT0id4Iz0 )

 

The power of Project Ironfist comes from our ability to modify the game's source code. While this initially required a highly complicated process, today we can interact with and override the game code about as easily as any other programming task.

 

Introduction

 

The main idea of Project Ironfist is that we start with a disassembly of the original game, and then modify it. However, instead of modifying the assembly directly, we've set up infrastructure so that we can piecemeal move individual functions and data structures into C++. This makes modifying the game mostly the same as normal programming, while taking advantage of all existing code.

 

If you have already completed the Getting Started with Coding tutorial, then you will already have added heroes2.asm and heroes2_imports.inc to your workspace. (If you wish to modify the map editor, the corresponding files are editor.asm and editor_imports.inc.)

 

heroes2.asm is our special disassembly of the original game. Among many other things, it wraps all function definitions in compile-guards (similar to #ifndef in C). For example,

IFDEF IMPORT_?CanFit@army@@QAEHHHPAH@Z
?CanFit@army@@QAEHHHPAH@Z PROTO SYSCALL
ELSE
?CanFit@army@@QAEHHHPAH@Z proc near SYSCALL 
...
ENDIF

This allows us to control whether to use the original version of the army::CanFit method, or our own version, simply by setting a flag.
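
For readers more at home in C++ than in MASM, the guard behaves roughly like the preprocessor sketch below. This is only an analogy: IMPORT_CANFIT, the free function CanFit, and both function bodies are made up for illustration and do not come from the Ironfist sources.

```cpp
#include <cassert>

// Analogy only: IMPORT_CANFIT and both CanFit bodies are hypothetical.
#define IMPORT_CANFIT 1  // remove this line to fall back to the "original"

#ifdef IMPORT_CANFIT
// Flag set: the guarded code only declares the function and expects
// the definition to be supplied from somewhere else.
int CanFit(int hex);
#else
// Flag not set: the "original" definition is compiled in.
int CanFit(int hex) { return 0; }
#endif

#ifdef IMPORT_CANFIT
// Our replacement, which callers of CanFit now reach instead.
int CanFit(int hex) { return hex >= 0 ? 1 : 0; }
#endif
```

With the flag defined, every call to CanFit reaches the replacement. Without it, the replacement is not compiled and the original body is used.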

 

You should never need to modify heroes2.asm. Instead, all of those flags are placed in heroes2_imports.inc, a much smaller and simpler file. Details on how to do this are in the "Modifying Functions and Variables from the Original Game" section below. This also allows us to write multiple programs that interact with the game's source code. For example, our h2guiviewer program uses the game's GUI library to provide a viewer for its GUI definition files.

 

You can find a decompilation of the game (HEROES2W.c) and reverse-engineered versions of its types (HEROES2W.h) in the ironfist/src/raw_decompiled directory, periodically updated as the information becomes more accurate. You can browse it to find pieces of the game code you'd like to use or override.

 

 

Using Functions and Variables from the Original Game

 

Using functions and variables from the original game is easy: just declare them extern.

 

First, look through the decompilation in HEROES2W.c for the function or variable you wish to use. However, they may not be properly declared in the decompilation. Find the corresponding declaration in heroes2.asm in order to find its decorated name, which contains full information about the signature. Paste this name into c++filtjs ( https://demangler.com/ ) to decode the decorated name into the signature. Once you have the signature, declare it extern. If it refers to any special classes or structures, you'll need to declare them -- copy in the definitions from HEROES2W.h .

 

For example, suppose we wish to use the army::CanFit method.  We search for "CanFit" in heroes2.asm and find that its full mangled name is ?CanFit@army@@QAEHHHPAH@Z . We feed this into c++filtjs and obtain that its signature is  public: int __thiscall army::CanFit(int,int,int *) . We now find the definition of army and modify it:

 

class army {
...
public:
...
  int CanFit(int,int,int *);
};

 

Note that all C++ methods have the  __thiscall calling convention by default, and extern linkage by default, so both keywords are omitted.

 

You will now be able to call army::CanFit just like any other method.

 

As another example, suppose we wish to access the global gpCombatManager object which contains all information about the current battle. In HEROES2W.c, it is declared as combatManager* gpCombatManager. Searching for gpCombatManager in heroes2.asm, we find that its decorated name is ?gpCombatManager@@3PAVcombatManager@@A, which c++filtjs tells us is indeed class combatManager * gpCombatManager. We first need to copy in the definition of combatManager from HEROES2W.h (but declaring it class instead of struct). We will now be able to access gpCombatManager by making the following declaration:

 

extern combatManager *gpCombatManager; 

 

Modifying Functions and Variables from the Original Game

 

 

You can redefine functions and variables almost as easily as you can import them -- you simply define them, and then instruct the original code to use the new version. An optional, but helpful first step before you do this is to announce your plan to the list. This is because a skilled decompiler-user will be able to clean up the code you want to modify a lot faster than someone with only the decompilation, and it can save you some time. The best decompilations look almost exactly like the original source code.

 

Continuing with the above example, suppose we wanted to redefine the army::CanFit method. We first declare it as above, but not extern. Since extern and non-extern methods are declared the same way, we do not need to change anything here. We now define it. We can start by using the decompiled definition of army::CanFit from HEROES2W.c, and modifying it.

 

int army::CanFit(int hex, int mayShiftTwoHexers, int *rearHex) {
  bool result; // eax@6
  int secondHex; // [sp+10h] [bp-8h]@11
  int othSecondHex; // [sp+10h] [bp-8h]@23
  hexcell *tile; // [sp+14h] [bp-4h]@1

  tile = 0;
  if ( rearHex )
    *rearHex = hex;
  ...
}

 

Note that the decompilation is to a C-like pseudocode rather than C++, so we'll need to modify the definition a bit to make it valid C++, and then clean it up some more to become good C++. For example, we might change the call

 

army::GetAdjacentCellIndex(this, hex, (unsigned int)(this->facingRight - 1) < 1 ? HEX_NEIGHBOR_RIGHT : HEX_NEIGHBOR_LEFT) 

 

to

 

GetAdjacentCellIndex(hex, facingRight ? HEX_NEIGHBOR_RIGHT : HEX_NEIGHBOR_LEFT)

 

Meanwhile, to redefine gArtifactNames, which controls the names of artifacts, we can find its definition in HEROES2W.c and simply copy it in:

 

char *gArtifactNames[] = {
    "Ultimate Book of Knowledge",
    "Ultimate Sword of Dominion",
    ...   
    "Spade of Necromancy"
};

 

All that's left is to instruct the original code to use our new version. We'll need to add one line to heroes2_imports.inc.

 

Modifying heroes2_imports.inc

 

heroes2_imports.inc is an Assembly include file which instructs the assembly to import functions and data from C++ instead of using its own definitions. Every declaration has an associated flag controlling whether to use the assembly or C++ version, consisting of the decorated name with "IMPORT_" prepended. All we need to do is define this flag. For example, to override army::CanFit, which has the decorated name ?CanFit@army@@QAEHHHPAH@Z, we would add the following line

 

IMPORT_?CanFit@army@@QAEHHHPAH@Z = 1

 

Wrapping Original Functions

 

Sometimes when modifying a function from the original game, you'd like to call the original function somewhere.

 

For example, here's how we modified the army::MoveAttack method to support the harpy's "strike and return" ability.

 

void army::MoveAttack(int targHex, int x) {
	int startHex = this->occupiedHex;
	this->MoveAttack_orig(targHex, x);

	if( !(this->creature.creature_flags & DEAD) &&
		CreatureHasAttribute(this->creatureIdx, STRIKE_AND_RETURN)) {
		MoveTo(startHex);
	}
}

 

Whenever you declare a symbol as imported in heroes2_imports.inc,  it will create a copy of the original definition under the symbol name "<symbol_name>_clone" . This name will typically not be a valid mangled C++ method name, but it can be renamed to something that is, and then invoked.

 

For example, the following two lines together mark the army::MoveAttack method as imported and create a method with the original behavior under the name army::MoveAttack_orig .

 

IMPORT_?MoveAttack@army@@QAEXHH@Z = 1
?MoveAttack@army@@QAEXHH@Z_clone EQU ?MoveAttack_orig@army@@QAEXHH@Z

 

Then, in the army class, we declare both methods.

 

  void MoveAttack(int,int);
  void MoveAttack_orig(int,int);
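
To see the wrapping pattern in isolation, here is a minimal sketch that compiles without the game. Everything in it is a stand-in: the army class is cut down to a few assumed fields, MoveAttack_orig is a stub playing the role of the renamed assembly clone, and the dead flag, the creature index 7 and the value of STRIKE_AND_RETURN are invented for the demonstration.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real definitions come from HEROES2W.h.
const int STRIKE_AND_RETURN = 1;  // assumed attribute id

class army {
public:
  int occupiedHex = 0;
  int creatureIdx = 0;
  bool dead = false;  // simplified replacement for the creature_flags check

  void MoveAttack(int targHex, int x);       // our override
  void MoveAttack_orig(int targHex, int x);  // original behavior
  void MoveTo(int hex) { occupiedHex = hex; }
};

// Stub for the assembly clone: the stack simply ends up on the target hex.
void army::MoveAttack_orig(int targHex, int /*x*/) { occupiedHex = targHex; }

// Stub: pretend only creature 7 has the ability.
bool CreatureHasAttribute(int creatureIdx, int attr) {
  return creatureIdx == 7 && attr == STRIKE_AND_RETURN;
}

// The wrapper: remember the start, run the original, then add new behavior.
void army::MoveAttack(int targHex, int x) {
  int startHex = occupiedHex;
  MoveAttack_orig(targHex, x);
  if (!dead && CreatureHasAttribute(creatureIdx, STRIKE_AND_RETURN)) {
    MoveTo(startHex);
  }
}
```

A stack with creatureIdx 7 that starts on hex 10 and attacks hex 20 ends up back on hex 10, while any other stack stays on hex 20.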

 

 

An Exception

 

Remember when I said you should never have to modify heroes2.asm? Well, there's one case when you will -- when redefining data items which have nonstandard types.

 

It's easiest to give an example. Have a look at the definition of gMonsterDatabase, which contains all the creature stats.

 

IFDEF IMPORT_?gMonsterDatabase@@3PAUtag_monsterInfo@@A
extern ?gMonsterDatabase@@3PAUtag_monsterInfo@@A : dword
ELSE
?gMonsterDatabase@@3PAUtag_monsterInfo@@A tag_monsterInfo <20, 21h, 11h, 12, 1, 0, 2, 1, 1, 1, 1, 0, 'psnt', 0, 0>
...
ENDIF

 

If the IMPORT_?gMonsterDatabase@@3PAUtag_monsterInfo@@A flag is defined, this declares gMonsterDatabase as a dword. Otherwise, it is declared as a tag_monsterInfo. Since the rest of the assembly uses it as a tag_monsterInfo, we'll get errors if we redefine gMonsterDatabase. Since we do redefine it, in order to add new creatures and edit their stats, we need to change the extern declaration to:

 

extern ?gMonsterDatabase@@3PAUtag_monsterInfo@@A : tag_monsterInfo

 

When we find problems like this, we normally just change the script that generates the assembly to fix it. Unfortunately, in the API that script uses, it is not possible to determine the type of data items. We've contacted the creator and hope to get this resolved, but for now, we can modify it by hand, since this issue occurs rarely.

 

Another Exception: Finding the Decorated Name of a Function

 

The mangled name—or decorated name—of a function is used when we want to import and override a function to modify its behavior or when we want to call the function as an external function. On some occasions, a function's name in the ASM file may not be properly decorated. In such a case, you will need to find the correct decorated name, then modify the ASM file to contain the correctly decorated name. To find the correct decorated name, you can use error messages generated from the linker.

To generate this error, simply declare the function as extern, using the signature found in the decompilation (since we don’t have a decorated name to decode with c++filtjs). Next, you simply attempt to build the solution, and a linker error should appear for every function that hasn’t been properly decorated yet. Within the description of the linker error, the proper decorated name of the desired function or variable can be found. Once the decorated name has been found, you will need to modify the ASM and update every occurrence of the original name of the function with the new, decorated name. If you wish to simply call the function as an external function, you should only need to modify the ASM; however, if you need to modify the behavior of the function, you will need to modify the corresponding imports file (either heroes2_imports.inc or editor_imports.inc).

You must also make sure that the calling convention in the declaration of the function has been changed from

“<decoratedName> proc near C” to “<decoratedName> proc near SYSCALL”.

For variables and other data structures, you must make sure that they are properly imported at the beginning of the ASM file with this format:

IFNDEF IMPORT_<decoratedName>
PUBLIC <decoratedName>
ENDIF

Additionally, if the name of the function or variable needs to be changed, simply do a string replace of every occurrence of the original identifier with the desired name. For example:

“?dword_48F6B8@@3HA” would be changed to “?OriginalSpell@@3HA”.

Alternatively, you can simply change the identifier before generating the linker error, saving you some steps. The identifier in the declaration is not as important as the structure and location of the declaration, so this is okay.

Note that this process applies to variables and other data structures as well.

 

 

Pitfalls

 

We've developed an astoundingly easy way of modifying programs without their source, but there are still many things that can go wrong. Here are some common sources of errors.

 

Structure Layout

 

The original game code expects the information in a structure to be laid out in a certain way. Our new code expects the information in a structure to be laid out in a certain way. If these are not the same, there will be bugs. This is by far the biggest thing you can mess up.

 

A few of the game's major data structures, such as the resource manager, have been "lifted" so that they never interact with assembly, and can be modified freely. We used to partition the game's code into a tied section that had to respect the original game's layout and a lifted section that could be modified freely, but this proved too cumbersome. Luckily, for top-level classes and structures such as game and combatManager, you should be able to fairly freely add new fields to the end (but not the middle!). Additionally, you can use the trick of replacing statically allocated arrays with dynamically allocated ones. For example, in the original game's code, each hero contained an array of 65 bytes marking whether the hero knew each spell. When we wanted to add a 66th spell, we replaced this with a pointer to a malloc'd array, followed by 61 (or rather, ORIG_SPELLS-sizeof(char*)) unused bytes. Thus, while we needed to change every method that directly accessed the spellsLearned field, we did not need to disrupt the rest of the hero structure. (Unfortunately, we also had to change the save-game format to not simply do a binary dump of the hero structure.)
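
The array-to-pointer trick can be sketched as follows. The two structures are heavily simplified and hypothetical: only spellsLearned and ORIG_SPELLS are names from the text above, while name, mobility and InitSpells are invented for the example. The static_assert lines let the compiler confirm that the fields after the replaced array have not moved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

const int ORIG_SPELLS = 65;  // size of the original spellsLearned array

#pragma pack(push, 1)
// Hypothetical, simplified "original" layout.
struct HeroOrig {
  char name[13];
  char spellsLearned[ORIG_SPELLS];
  int mobility;  // stands in for every field that follows
};

// Same footprint: a pointer plus filler where the array used to be.
struct HeroNew {
  char name[13];
  char *spellsLearned;
  char unused[ORIG_SPELLS - sizeof(char *)];
  int mobility;
};
#pragma pack(pop)

// Compile-time checks that nothing after the array has moved.
static_assert(sizeof(HeroOrig) == sizeof(HeroNew), "hero size changed");
static_assert(offsetof(HeroOrig, mobility) == offsetof(HeroNew, mobility),
              "fields after spellsLearned moved");

// Allocate zeroed room for more spells than the old array could hold.
void InitSpells(HeroNew &h, int numSpells) {
  h.spellsLearned = static_cast<char *>(std::calloc(numSpells, 1));
}
```

On a 32-bit build the filler is 61 bytes, matching the figure above. On a 64-bit build the same expression yields 57, and the asserts still hold.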

 

The reverse-engineered types in HEROES2W.h are declared in the same way as the original game code, so you can mostly dodge this problem by preserving them. However, for alignment reasons, a newer compiler may insert padding between the fields, which changes their offsets. We can use a preprocessor directive to stop this. Add the following line before your structure declarations:

 

#pragma pack(push,1)


and the following line after

 

#pragma pack(pop)
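
The effect of the directive is easy to check with a small sketch. The two structures below are hypothetical and exist only to compare sizes. They use the fixed-width types from <cstdint> so that the field sizes do not depend on the compiler.

```cpp
#include <cassert>
#include <cstdint>

// Default alignment: the compiler is free to pad after 'flag'.
struct Unpacked {
  std::int8_t flag;
  std::int32_t value;
};

#pragma pack(push, 1)
// Packed: the fields sit back to back, 1 + 4 bytes.
struct Packed {
  std::int8_t flag;
  std::int32_t value;
};
#pragma pack(pop)

static_assert(sizeof(Packed) == 5, "packed struct should have no padding");
static_assert(sizeof(Unpacked) >= sizeof(Packed), "padding can only add bytes");
```

On common compilers Unpacked is typically 8 bytes. A static_assert on sizeof next to each copied structure is a cheap way to catch layout drift at compile time.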

 

Enum Size

 

We declare many fields of the reverse-engineered types as enums to help with code maintainability. However, this obscures the size of the field. When copying an enum definition from HEROES2W.h, make sure to keep its size specifier. For example, the elements of CREATURE_ATTRIBUTES occupy one byte, and thus it uses the following declaration:

 

enum CREATURE_ATTRIBUTES : __int8
{
  ATTR_MIRROR_IMAGE = 0x1,
  ...
};

 

Any enum which does not have its size specified should be assumed to be 4 bytes (int size), but the compiler may also shrink it. You can prevent this by adding ": int" or ": DWORD" to the declaration.
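
As a sketch of how to pin enum sizes down, the snippet below uses std::int8_t in place of the MSVC-specific __int8 so that it also builds with other compilers. EXAMPLE_WIDE and ExampleRecord are hypothetical names invented for the illustration, and only the first enumerator of CREATURE_ATTRIBUTES is reproduced.

```cpp
#include <cassert>
#include <cstdint>

// One byte per value, as in HEROES2W.h (remaining enumerators omitted).
enum CREATURE_ATTRIBUTES : std::int8_t {
  ATTR_MIRROR_IMAGE = 0x1,
};

// Hypothetical enum, pinned to int size explicitly.
enum EXAMPLE_WIDE : int {
  EXAMPLE_VALUE = 0,
};

#pragma pack(push, 1)
// Hypothetical record that uses both enums as fields.
struct ExampleRecord {
  CREATURE_ATTRIBUTES attrs;
  EXAMPLE_WIDE kind;
};
#pragma pack(pop)

static_assert(sizeof(CREATURE_ATTRIBUTES) == 1, "attribute enum must be 1 byte");
static_assert(sizeof(EXAMPLE_WIDE) == sizeof(int), "wide enum must be int-sized");
static_assert(sizeof(ExampleRecord) == 1 + sizeof(int), "record size drifted");
```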

 

Structs and Classes

 

In C++, struct Foo {...} is equivalent to class Foo { public: ...} -- for everything except name decoration. Everything in HEROES2W.h is declared as a struct, but you'll need to correctly declare something as class or struct in order for your declarations to have the correct decorated name.

 

Additionally, all inheritance information is lost; all fields from the parent classes are redeclared in the child. For example, here are the declarations of widget and border:

 

struct widget
{
  widgetVtable *vtable;
  heroWindow *parentWindow;
  ...
  __int16 height;
};

struct border
{
  widgetVtable *vtable;
  heroWindow *parentWindow;
  ...
  __int16 height;
  bitmap *bitmap;
  icon *icon;
  __int16 color;
};

 

We need to declare both as classes instead of structs, and make the inheritance explicit. Additionally, the virtual function table pointer will be inserted automatically when we declare a virtual function (see below), so we omit it. Here are the new declarations:

 

class widget
{
public:
  heroWindow *parentWindow;
  ...
  __int16 height;
};

class border : public widget
{
public:
  bitmap *bitmap;
  icon *icon;
  __int16 color;
};

 

Pointers and Arrays

 

C and C++ try to treat pointers and arrays the same, and, for the most part, they are indistinguishable. This extends to the name decoration -- an int* and an int[] will have the same decorated name. However, they are stored differently in memory, and different machine code will be used to access them. If you declare something that should be an array as a pointer, or vice versa, there will be bugs.

 

HEROES2W.c is fairly accurate about which is which. Hardcoded tables of information are usually arrays, while everything else is a pointer.
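
The storage difference can be demonstrated with a short sketch. gTable and gPtr are hypothetical globals, not symbols from the game. An array object contains its elements, while a pointer object contains only an address, so code generated for one cannot safely be pointed at the other.

```cpp
#include <cassert>

int gTable[4] = {10, 20, 30, 40};  // hypothetical hardcoded table
int *gPtr = gTable;                // hypothetical pointer to the same data

// For an array, the object's own address is where the elements live.
bool ArrayStoresElementsInline() {
  return static_cast<void *>(&gTable) == static_cast<void *>(&gTable[0]);
}

// For a pointer, the object holds an address; the elements live elsewhere.
bool PointerStoresElementsInline() {
  return static_cast<void *>(&gPtr) == static_cast<void *>(&gPtr[0]);
}
```

Both gTable[2] and gPtr[2] read the value 30 here, but the first is a direct load from the table while the second first loads the pointer and then follows it. Declaring a game symbol with the wrong one makes the compiler emit the wrong kind of access.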

 

Writable String "Constants"

 

If you have a line of code like:

 

printf("Hello, world!");

 

That string constant will be stored in memory. The original game will store them in the ".data" section of the executable, so that the "constants" can be overwritten and modified. Newer versions of Visual C++ will store it in the ".rdata" section, which is read-only.

 

Some Heroes II functions may try to modify a string you pass in, even if it's declared "const char *" (possibly because of bad type inference). Hence, fairly innocent method calls may crash, even if seemingly identical ones work in the original game. I believe this comes up both when writing messages at the bottom of the Map Editor (e.g.: the "ShowErrorMessage" function), and when displaying messages in the combat window.
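
One defensive pattern, sketched below, is to copy the literal into writable storage before handing it to such a function. UppercaseFirstLetter is a hypothetical stand-in for a game function that edits its argument in place, and MakeWritable is an invented helper, not part of Ironfist.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a game function that modifies its argument.
void UppercaseFirstLetter(char *msg) {
  if (msg[0] >= 'a' && msg[0] <= 'z') {
    msg[0] = static_cast<char>(msg[0] - 'a' + 'A');
  }
}

// Copy the text, including its terminator, into a writable buffer.
std::vector<char> MakeWritable(const char *text) {
  return std::vector<char>(text, text + std::strlen(text) + 1);
}
```

A call then looks like: std::vector<char> buf = MakeWritable("hello, world!"); UppercaseFirstLetter(buf.data()); The literal itself is never written to, so it does not matter which section the compiler places it in.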

 

Virtual Functions

 

We thought we'd need clever tricks in order to use or redefine virtual functions, but thankfully, we were wrong. However, you must declare them in the right order, or else they'll be in the wrong order in the associated virtual function table. There are only seven virtual functions in the game -- three in baseManager, three in widget, and one in resource. These have all already been exposed, so as long as you do not mess with them, this will not be a problem.

 

However, if you declare a class which defines a virtual function, then when you compile, you'll get an error, because it will generate a virtual function table which conflicts with the existing virtual function table. The error will tell you the decorated name you'll need to import, and then a simple flag in heroes2_imports.inc fixes the problem:

 

IMPORT_??_7resourceManager@@6B@ = 1

 

Destructors

 

iconWidget is a class in the original game. Suppose you give a class definition in C++. You'll need to declare as extern every method that you call on it, right?

 

There's one more important thing: if a class has a destructor in the original game, you must declare that destructor in your C++ class definition.

 

E.g.:

 

class iconWidget : public widget {
public:
  // Insert fields here
  iconWidget(...);
  ~iconWidget();
  void method1();  // placeholder: use the real return type and parameters
};

 

The code will compile if you omit the ~iconWidget(), but then the compiler will not insert calls to ~iconWidget() where they should happen, and that can break the game. 

 

 

Twin Runtimes

 

The compiler will automatically add a runtime to your code -- before any of your code runs, the application will allocate some memory, register itself with the operating system, etc. The assembly code contains an old version of the same runtime. Sometimes, this will cause problems. If you open a file using the new version of _open, but the game tries to read from it using the old version of __read, the game will crash. We can fix this by simply demanding the assembly use the new version of __read:

 

IMPORT___read = 1

 

We currently do this for a variety of IO functions, as well as for new and delete. Hopefully, we won't have a problem from this again.

 

Name Conflicts

 

Similar to the above, sometimes both runtimes will have functions with the same name, and you want to keep both of them around. For example, both Ironfist and the original game have a WinMain function to initialize things. You can remove the name conflict by issuing a directive in heroes2_imports.inc to rename the assembly version. The "EQU" directive does exactly that.

 

_WinMain@16 EQU  _WinMain_asm@16

 

Again, you will likely never encounter this problem.

 

Debugging

 

One problem with interacting with assembly is that you may need to debug the program at the assembly level. If you're having trouble, feel free to ask for help -- one of the project goals is that virtually none of the team should need to touch assembly (in a nontrivial way). However, most of the time when I personally have run the mod in an assembly debugger, I found that the problem was actually in my C++ code, and that I would have found the problem faster by using the C++ debugger in Visual Studio. You should thus first try to debug the program using more mundane means.

 

When the program crashes, even when the error occurs in assembly, very often just knowing the name of the function giving the error is enough to determine where the problem is. Visual Studio contains a primitive assembly-level debugger, enough to determine that.

 

I recommend using the "Attach to process" feature of the Visual Studio debugger for starting the debugger.

 

Also note that Heroes II has built-in functionality to emit log messages, via the "LogStr" function and friends. Run the game using the "/P9" option to raise the debug level. Then it will write output to KB.txt.

 

Other Useful Knowledge

 

Bitfields

 

It's useful to understand bitfields to work with the Ironfist code. This is particularly true for code that deals with the adventure map.

 

The original HoMM II code contains many bitfields, variables with an unusual number of bits such as 1 bit or 13 bits. IDA cannot handle these properly, and so any decompiled code that touches them is full of bit-ops.

 

For example, consider the following field of mapCell:

 

field_4_1_1_isShadow_1_13_extraInfo

 

This states that field_4, a 16-bit field, is actually subdivided into 1 bit representing something unknown, 1 bit representing whether the tile is a shadow, 1 more bit of unknown, followed by the 13-bit "extraInfo" field.

 

This would correspond to the following declaration in C:

 

unsigned int field_4_1 : 1;
unsigned int isShadow : 1;
unsigned int field_4_3 : 1;
unsigned int extraInfo : 13;

 

Suppose you see the decompiled code:

 

cell->field_4_1_1_isShadow_1_13_extraInfo = (cell->field_4_1_1_isShadow_1_13_extraInfo & 0x7) | ((x & 0x1FFF) << 3);

 

This corresponds to the cleaned-up C code:

 

cell->extraInfo = x; 

 

To see how this is so, think through the bits of that long line. The RHS is constructing a 16-bit int whose bottom 3 bits are the same as the current field, and whose top 13 bits are x.
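
The same bit manipulation can be written as a pair of small helpers, which makes it easy to check by hand. SetExtraInfo, GetExtraInfo and the three constants are hypothetical names, not part of the game code. The sketch works on a plain 16-bit integer instead of a real bitfield so that its behavior does not depend on how a particular compiler lays bitfields out.

```cpp
#include <cassert>
#include <cstdint>

const std::uint16_t LOW_BITS_MASK = 0x7;       // the three 1-bit fields
const std::uint16_t EXTRA_INFO_MASK = 0x1FFF;  // 13 bits of extraInfo
const int EXTRA_INFO_SHIFT = 3;                // extraInfo starts above them

// Equivalent of "cell->extraInfo = x": keep the low 3 bits, replace the rest.
std::uint16_t SetExtraInfo(std::uint16_t field, unsigned x) {
  return static_cast<std::uint16_t>(
      (field & LOW_BITS_MASK) | ((x & EXTRA_INFO_MASK) << EXTRA_INFO_SHIFT));
}

// Equivalent of reading "cell->extraInfo": drop the low 3 bits.
std::uint16_t GetExtraInfo(std::uint16_t field) {
  return static_cast<std::uint16_t>(field >> EXTRA_INFO_SHIFT);
}
```

For instance, starting from a field whose low bits are 0x5 and storing 0x1ABC gives 0xD5E5: the low three bits are untouched, and reading the value back returns 0x1ABC.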

 

Due to a bug in the Hex-Rays decompiler, an access of extraInfo will often decompile to something like (field_4_1_1_isShadow_1_13_extraInfo >> 8) >> -5, which shows that it's extracting a 13-bit value, but is not actually a valid bit shift. This bug was fixed by our request in one version of Hex-Rays, and then reappeared a little later. Just learn to mentally translate these nonsensical bit-shifts into the correct thing.

 

 
