Reverse Engineering Fun

Recently I read a lot about reverse engineering –  in relation to malware research. I for one simply wanted to get ancient and hardly documented HVAC engineering software to work.

The software in question should have shown a photo of the front panel of a device – knobs and displays – augmented with current system’s data, and you could have played with settings to ‘simulate’ the control unit’s behavior.

I tested it on several machines, to rule out some typical issues quickly: Will in run on Windows 7? Will it run on a 32bit system? Do I need to run it was Administrator? None of that helped. I actually saw the application’s user interface coming up once, on the Win 7 32bit test machine I had not started in a while. But I could not reproduce the correct start-up, and in all other attempts on all other machines I just encountered an error message … that used an Asian character set.

I poked around the files and folders the application uses. There were some .xls and .xml files, and most text was in the foreign character set. The Asian error message was a generic Windows dialogue box: You cannot select the text within it directly, but the whole contents of such error messages can be copied using Ctrl+C. Pasting it into Google Translate it told me:

Failed to read the XY device data file

Checking the files again, there was an on xydevice.xls file, and I wondered if the relative path from exe to xls did not work, or if it was an issue with permissions. The latter was hard to believe, given that I simply copied the whole bunch of files, my user having the same (full) permissions on all of them.

I started Microsoft Sysinternals Process Monitor to check if the application was groping in vain for the file. It found the file just fine in the right location:

Immediately before accessing the file, the application looped through registry entries for Microsoft JET database drivers for Office files – the last one it probed was msexcl40.dll – a  database driver for accessing Excel files.

There is no obvious error in this dump: The xls file was closed before the Windows error popup was brought up; so the application had handled the error somehow.

I had been tinkering a lot myself with database drivers for Excel spreadsheets, Access databases, and even text files – so that looked like a familiar engineering software hack to me 🙂 On start-up the application created a bunch of XML files – I saw them once, right after I saw the GUI once in that non-reproducible test. As far as I could decipher the content in the foreign language, the entries were taken from that problematic xls file which contained a formatted table. It seemed that the application was using a sheet in the xls file as a database table.

What went wrong? I started Windows debugger WinDbg (part of the Debugging tools for Windows). I tried to go the next unhandled or handled exception, and I saw again that it stumbled over msexec40.dll:

But here was finally a complete and googleable error message in nerd speak:

Unexpected error from external database driver (1).

This sounded generic and I was not very optimistic. But this recent Microsoft article was one of the few mentioning the specific error message – an overview of operating system updates and fixes, dated October 2017. It describes exactly the observed issue with using the JET database driver to access an xls file:

Finally my curious observation of the non-reproducible single successful test made sense: When I started the exe on the Win 7 test client, this computer had been started the first time after ~3 months; it was old and slow, and it was just processing Windows Updates – so at the first run the software had worked because the deadly Windows Update had not been applied yet.

Also the ‘2007 timeframe’ mentioned was consistent – as all the application’s executable files were nearly 10 years old. The recommended strategy is to use a more modern version of the database driver, but Microsoft also states they will fix it again in a future version.

So I did not get the software to to run, as I obviously cannot fix somebody else’s compiled code – but I could provide the exact information needed by the developer to repair it.

But the key message in this post is that it was simply a lot of fun to track this down 🙂

Anniversary 4 (4 Me): “Life Ends Despite Increasing Energy”

I published my first post on this blog on March 24, 2012. Back then its title and tagline were:

Theory and Practice of Trying to Combine Just Anything
Physics versus engineering
off-the-wall geek humor versus existential questions
IT versus the real thing
corporate world’s strangeness versus small business entrepreneur’s microcosmos knowledge worker’s connectedness versus striving for independence

… which became

Theory and Practice of Trying to Combine Just Anything
I mean it

… which became

elkemental Force
Research Notes on Energy, Software, Life, the Universe, and Everything

last November. It seems I have run out of philosophical ideas and said anything I had to say about Life and Work and Culture. Now it’s not Big Ideas that make me publish a new post but my small Big Data. Recent posts on measurement data analysis or on the differential equation of heat transport  are typical for my new editorial policy.

Cartoonist Scott Adams (of Dilbert fame) encourages to look for patterns in one’s life, rather than to interpret and theorize – and to be fooled by biases and fallacies. Following this advice and my new policy, I celebrate my 4th blogging anniversary by crunching this blog’s numbers.

No, this does not mean I will show off the humbling statistics of views provided by WordPress 🙂 I am rather interested in my own evolution as a blogger. Having raked my virtual Zen garden two years ago I have manually maintained lists of posts in each main category – these are my menu pages. Now I have processed each page’s HTML code automatically to count posts published per month, quarter, or year in each category. All figures in this post are based on all posts excluding reblogs and the current post.

Since I assigned two categories to some posts, I had to pick one primary category to make the height of one column reflect the total posts per month:Statistics on blog postings: Posts per month in each main category

It seems I had too much time in May 2013. Perhaps I needed creative compensation – indulging in Poetry and pop culture (Web), and – as back then I was writing a master thesis.

I had never missed a single month, but there were two summer breaks in 2012 and 2013 with only 1 post per month. It seems Life and Web gradually have been replaced by Energy, and there was a flash of IT in 2014 which I correlate with both nostalgia but also a professional flashback owing to lots of cryptography-induced deadlines.

But I find it hard to see a trend, and I am not sure about the distortion I made by picking one category.

So I rather group by quarter:

Statistics on blog postings: Posts per quarter in each main category

… which shows that posts per quarter have reached a low right now in Q1 2016, even when I would add the current posting. Most posts now are based on original calculations or data analysis which take more time to create than search term poetry or my autobiographical vignettes. But maybe my anecdotes and opinionated posts had just been easy to write as I was drawing on ‘content’ I had in mind for years before 2012.

In order to spot my ‘paradigm shifts’ I include duplicates in the next diagram: Each post assigned to two categories is counted twice. Since then the total number does not make sense I just depict relative category counts per quarter:

Statistics on blog postings: Posts per quarter in each category, including the assignment of more than one category.

Ultimate wisdom: Life ends, although Energy is increasing. IT is increasing, too, and was just hidden in the other diagram: Recently it is  often the secondary category in posts about energy systems’ data logging. Physics follows an erratic pattern. Quantum Field Theory was accountable for the maximum at the end of 2013, but then replaced by thermodynamics.

Web is also somewhat constant, but the list of posts shows that the most recent Web posts are on average more technical and less about Web and Culture and Everything. There are exceptions.

Those trends are also visible in yearly overviews. The Decline Of Web seems to be more pronounced – so I tag this post with Web.

Statistics on blog postings: Posts per year in each main category

Statistics on blog postings: Posts per year in each category, including the assignment of more than one category.

But perhaps I was cheating. Each category was not as stable as the labels in the diagrams’ legends do imply.

Shortcut categories refer to
1) these category pages: EnergyITLifePhysicsPoetryWeb,
2) and these categories EnergyITLifePhysicsPoetryWeb, respectively, manually kept in sync.

So somehow…

public-key-infrastructure became control-and-it

and

on-writing-blogging-and-indulging-in-web-culture is now simply web

… and should maybe be called nerdy-web-stuff-and-software-development.

In summary, I like my statistics as it confirms my hunches but there is one exception: There was no Poetry in Q1 2016 and I have to do something about this!

________________________________

The Making Of

  • Copy the HTML content of each page with a list to a text editor (I use Notepad2).
  • Find double line breaks (\r\n\r\n) and replace them by a single one (\r\n).
  • Copy the lines to an application that lets you manipulate strings (I use Excel).
  • Tweak strings with formulas / command to cut out date, url, title and comment. Use the HTML tags as markers.
  • Batch-add the page’s category in a new column.
  • Indicate if this is the primary or secondary category in a new column (Find duplicates automatically before so 1 can be assigned automatically to most posts.).
  • Group the list by month, quarter, and year respectively and add the counts to new data tables that will be used for diagrams (e.g. Excel function COUNTIFs, using only the category or category name  + indicator for the primary category as criteria).

It could be automated even better – without having to maintain category pages by simply using the category feeds (like this: https://elkement.wordpress.com/category/physics/feed) or by filtering the full blog feed for categories. I have re-categorized all my posts so that categories matches menu page lists, but I chose to use my lists as

  1. I get not only date and headline, but also my own additional summary / comment that’s not part of the feed. For our German blog, I actually do this in reverse: I create the HTML code of a a sitemap-style overview page on wordpress.com from an Excel list of all posts plus custom comments and then copy the auto-generated code to the HTML view of the respective menu page on the blog.
  2. the feed provided by WordPress.com can have 150 items maximum no matter which higher number you try to configure. So you need to start analyzing before you have published 150 posts.
  3. I can never resist to create a tool that manipulates text files and automates something, however weird.