A Color Box. Lost in Translation

It was that time again.

The Chief Engineer had rebuilt the technical room from scratch. Each piece of heavy equipment had a new place, each pipe and wire was reborn in a new incarnation (German stories here.)

The control system was turned upside down as well, and thus the Data Kraken was looking at its entangled tentacles, utterly confused. The fabric of spacetime was broken again – the Kraken was painfully reminded of its last mutation in 2016.

Back then a firmware update had changed the structure of log files exported by Winsol. Now these changes were home-made: Sensors have been added. Sensor values have been added to the logging setup. Known values are written to different places in the log file. The log has more columns. The electrical input power of the heat pump finally has a positive value. Energy meters have been reset during the rebuild – more than once. And on and on.

But Data Kraken had provided for such disruptions! In a classical end-of-calendar-year death march project, its software architecture had been overturned in 2016. Here is a highly illustrative ‘executive level’ diagram:

The powerful SQL Server Kraken got a little companion – the Proto Kraken. Proto Kraken proudly runs on Microsoft Access. It comprises the blueprint of Big Kraken – its DNA: a documentation of all measured values, and their eternal history: When was the sensor installed, at which position in the log file do you find its values from day X to day Y, when was it retired, do the values need corrections …

A Powershell-powered tentacle crafts the Big Kraken from this proto version. It’s like scientists growing a mammoth from fossils: The database can be rebuilt from the original log files, no matter how the file structure mutates over time.
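The mammoth-growing can be sketched in miniature: a metadata table says which column holds which sensor, and for which date range that layout is valid. This is a hedged illustration only – all sensor names, column positions, and dates below are invented, not the actual Proto Kraken schema:

```python
from datetime import date

# Hypothetical miniature of the Proto Kraken 'DNA': for each sensor, the
# column index in the raw log export and the date range for which that
# position is valid. All names, positions, and dates are invented.
COLUMN_MAP = [
    {"sensor": "brine_inlet_temp", "column": 3,
     "valid_from": date(2012, 1, 1), "valid_to": date(2018, 4, 30)},
    {"sensor": "brine_inlet_temp", "column": 5,
     "valid_from": date(2018, 5, 1), "valid_to": date(9999, 12, 31)},
    {"sensor": "heat_pump_power", "column": 7,
     "valid_from": date(2018, 5, 1), "valid_to": date(9999, 12, 31)},
]

def parse_row(log_date, raw_fields):
    """Map one raw log line onto sensor names, honoring the date-dependent layout."""
    row = {}
    for entry in COLUMN_MAP:
        if entry["valid_from"] <= log_date <= entry["valid_to"]:
            row[entry["sensor"]] = float(raw_fields[entry["column"]])
    return row

# The same sensor is read from column 3 before the rebuild, column 5 after it:
old = parse_row(date(2017, 6, 1), ["0", "0", "0", "11.5", "0", "99", "0", "0"])
new = parse_row(date(2018, 6, 1), ["0", "0", "0", "99", "0", "11.5", "0", "2.1"])
```

Because the mapping lives in metadata rather than in code, the whole database can be re-grown from the original log files whenever the layout mutates again.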

A different tentacle embraces the actual functions needed for data analysis – which is Data Kraken’s true calling. Kraken consolidates data from different loggers, and it has to do more than just calculate max / min / totals / averages. For example, the calculation of the heat pump’s performance factor had to mutate over time: Originally energy values had been read off manually from displays, but then the related meters were automated. Different tentacles need to reach out into different tables at different points of time.

Most ‘averages’ only make sense under certain conditions: The temperature at different points in the brine circuits should only contribute to an average temperature when the brine circulation pump is active. If you calculate the performance factor from heat source and target temperature (using a fit function), only time intervals when the heat pump was actually running may contribute.
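A minimal sketch of such a conditional average (the readings and the pump flag below are made up):

```python
def conditional_average(samples, condition):
    """Average sensor values only over intervals where the condition held –
    e.g. brine temperature only while the circulation pump was running."""
    relevant = [value for value, active in zip(samples, condition) if active]
    if not relevant:
        return None  # no interval qualified; an average would be meaningless
    return sum(relevant) / len(relevant)

# Invented readings: the 2.0 reading with the pump off would drag a naive
# average down, but it carries no information about the brine circuit.
temps = [11.0, 12.0, 2.0, 13.0]
pump_on = [True, True, False, True]
```

Here `conditional_average(temps, pump_on)` yields 12.0, whereas a naive average over all four samples would give 9.5.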

I live in fear of Kraken’s artificial intelligence – will it ever gain consciousness? Will I wake up once in a science fiction dystopia? Fortunately, there is one remaining stumbling block: We have not yet fully automated genetic engineering. How could that ever work? A robot or a drone trying to follow the Chief Engineer’s tinkering with sensor wiring … and converting this video stream into standardized change alerts sent to Data Kraken?

After several paragraphs laden with silly metaphors, I finally come to the actual metaphor in the title of this post. The

Color Box

Once you came up with a code name for your project, you cannot get it out of your head. That also happened to the Color Box (Farbenkastl).

Here, tacky tasteless multi-colored things are called a color box. Clothes and interior design for example. Or the mixture of political parties in parliament. That’s probably rather boring, but the Austrian-German term Farbenkastl has a colorful history: It had been used in times of the monarchy to mock the overly complex system of color codes applied to the uniforms of the military.

What a metaphor for a truly imperial tool: As a precursor to the precursor to the Kraken Database … I use the Color Box! Brought to me by Microsoft Excel! I can combine my artistic streak, coloring categories of sensors and their mutations. Excel formulas spawn SQL code.

The antediluvian 2016 color box was boring:

But trying to display the 2018 color box I am hitting the limit of Excel’s zooming abilities:

I am now waiting for the usual surprise nomination for a Science & Arts award. In the meantime, my Kraken enjoys its new toys. Again, the metaphoric power of this video is lost in translation as in German ‘Krake’ means octopus.

(We are still working at automating PVC piping via the Data Kraken, using 3D printing.)

Hacking

I am joining the ranks of self-proclaimed productivity experts: Do you feel distracted by social media? Do you feel that scrolling too many feeds transforms your mind – in a bad way? Solution: Go find an online platform that will put your mind in a different state. Go hacking on hackthebox.eu.

I have been hacking boxes over there for quite a while – and obsessively. I really wonder why I did not try to attack something much earlier. It’s funny as I have been into IT security for a long time – ‘infosec’ as it seems to be called now – but I was always a member of the Blue Team, a defender: Hardening Windows servers, building Public Key Infrastructures, always learning about attack vectors … but never really testing them extensively myself.

Earlier this year I was investigating the security of some things. They were black-boxes to me, and I figured I finally needed to learn about some offensive tools – so I set up a Kali Linux machine. Then I searched for the best way to learn about these tools and read articles and books about pentesting. But I had no idea if these ‘things’ were vulnerable at all, and where to start. So I figured: Maybe it is better to attack something made vulnerable intentionally? There are vulnerable web applications, and you can download vulnerable virtual machines … but then I remembered I saw posts about hackthebox some months ago:

As an individual, you can complete a simple challenge to prove your skills and then create an account, allowing you to connect to our private network (HTB Labs) where several machines await for you to hack them.

Back then I had figured I would not pass this entry challenge, nor hack any of these machines. It turned out otherwise, and it has been a very interesting experience so far – to learn about pentesting tools and methods on the fly. It has all been new, yet familiar in some sense.

Once I had been a so-called expert for certain technologies or products. But very often I became that expert by effectively reverse engineering the product a few days before I showed off that expertise. I had the exact same mindset and methods that are needed to attack the vulnerable applications of these boxes. I believe that in today’s world of interconnected systems, rapid technological change, [more buzz words here] every ‘subject matter expert’ is often actually reverse engineering – rather than applying knowledge acquired by proper training. I had certifications, too – but typically I never attended a course; I just took the exam after I had learned on the job.

On a few boxes I could use in-depth knowledge about protocols and technologies I had long-term experience with, especially Active Directory and Kerberos. However, I did not find those boxes easier to own than, e.g., the Linux boxes where everything was new to me. With Windows boxes I focussed too much on things I knew, and overlooked the obvious. On Linux I was just a humble learner – and it seemed this made me find the vulnerability or misconfiguration faster.

I felt like time-travelling back to when I started ‘in IT’, back in the late 1990s. Now I can hardly believe that I went directly from staff scientist in a national research center to down-to-earth freelance IT consultant – supporting small businesses. With hindsight, I knew so little both about business and about how IT / Windows / computers are actually used in the real world. I tried out things, I reverse engineered, I was humbled by what remained to be learned. But on the other hand, I was delighted by how many real-life problems – for whose solution people were eager to pay – can be solved pragmatically by knowing only 80%. Writing academic papers had felt more like aiming at 130% all of the time – but first you have to beg governmental entities to pay for it. Some academic colleagues were upset by my transition to the dark side, but I never saw this chasm: Experimental physics was about reverse engineering natural black-boxes – and sometimes about reverse engineering your predecessor’s enigmatic code. IT troubleshooting was about reverse engineering software. Theoretically it is all about logic and just zeros and ones, and you should be able to track down the developer who can explain that weird behavior. But in practice, as a freshly minted consultant without any ‘network’ you can hardly track down that developer in Redmond – so you make educated guesses and poke around the system.

I also noted eerie coincidences: In the months before being sucked into hackthebox’s black hole, I had been catching up on Python, C/C++, and Powershell – for productive purposes, for building something. But all of that is very useful now, for using or modifying exploits. In addition I realize that my typical console applications for simulations and data analysis are quite similar ‘in spirit’ to typical exploitation tools. Last year I also learned about design patterns and best practices in object-oriented software development – and I was about to overdo it. Maybe it’s good to throw in some Cowboy Coding for good measure!

But above all, hacking boxes is simply addictive in a way that cannot be fully explained. It is like reading novels about mysteries and secret passages. Maybe this is what computer games are to some people. Some commentators say that machines on pentesting platforms are more Capture-the-Flag (CTF) style than real-world pentesting. It is true that some challenges have a ‘story line’ that takes you from one solved puzzle to the next one. To some extent a part of the challenge has to be fabricated as there are no real users to social engineer. But there are very real-world machines on hackthebox, e.g. requiring you to escalate from one object in a Windows domain to another.

And if you ever have seen what stuff is stored in clear text in the real world, or what passwords might be used ‘just for testing’ (and never changed) – then also the artificial guess-the-password challenges do not appear that unrealistic. I want to emphasize that I am not the one to make fun of weak test passwords and the like at all. More often than not I was the one whose job was to get something working / working again, under pressure. Sometimes it is not exactly easy to ‘get it working’ quickly, in an emergency, and at the same time considering all security implications of the ‘fix’ you have just applied – by thinking like an attacker. hackthebox is an excellent platform to learn that, so I cannot recommend it enough!

An article about hacking is not complete if it lacks a clichéd stock photo! I am searching for proper hacker’s attire now – this was my first find!

Cloudy Troubleshooting (2)

Unrelated to part 1 – but the same genre.

Actors this time:

  • File Cloud: A cloud service for syncing and sharing files. We won’t drop a brand name, will we?
  • Client: Another user of File Cloud.
  • [Redacted]: Once known for reliability and as The Best Network.
  • Dark Platform: Wannabe hackers’ playground.
  • elkement: Somebody who sometimes just wants to be an end user, but always ends up sniffing and debugging.

There are no dialogues with human life-forms this time, only the elkement’s stream of consciousness, interacting with the others by looking at things on a screen.

elkement: Time for a challenging Sunday hack!

elkement connects to The Dark Platform. Hardly notices anything in the real world anymore. But suddenly elkement looks at the clock – and at File Cloud’s icon next to it.

elkement: File Cloud, what’s going on?? Seems you have a hard time Connecting… for hours now? You have not even synced my hacker notes from yesterday evening?

elkement tries to avoid looking at File Cloud, but it gets too painful.

elkement: OK – let’s consider the File Cloud problem the real Sunday hacker’s challenge…

elkement walks through the imaginary checklist:

  • File Cloud mentioned on DownDetector website? No.
  • Users tweeting about outage? No.
  • Do the other cloudy apps work fine? Yes.
  • Do other web sites work fine? Yes.
  • Does my router need its regular reboot because its DNS server got stuck? No.
  • Should I perhaps try the usual helpdesk recommendation? Yes. (*)

(*) elkement turns router and firewall off and on again. Does not help.

elkement gets worried about Client using File Cloud, too. Connects to Client’s network – via another cloudy app (that obviously also works).

  • Does Client have the same issues? Yes and no – yes at one site, no at another site.

elkement: Oh no – do I have to setup a multi-dimensional test matrix again to check for weird dependencies?

Coffee Break. Leaving the hacker’s cave. Gardening.

elkement: OK, let’s try something new!

elkement connects to super shaky mobile internet via USB tethering on the smart phone.

  • Does an alternative internet connection fix File Cloud? Yes!!

elkement: Huh!? Will now again somebody explain to me that a protocol (File Cloud) is particularly sensitive to hardly noticeable network disconnects? Is it maybe really a problem with [Redacted] this time?

elkement checks out DownDetector – and there they are: the angry users and the red spots on the map. They mention that seemingly random websites and applications fail. And that [Redacted] is losing packets.

elkement: Really? Only packets for File Cloud?

elkement starts sniffing. Checks IP addresses.

(elkement: Great, whois does still work, despite the anticipated issues with GDPR!)

elkement spots communication with File Cloud. File Cloud client and server are stuck in a loop of misunderstandings. File Cloud client is rude and says: RST, then starts again. Says Hello. They never shake hands as a previous segment was not captured.

elkement: But why does all the other stuff work??

elkement googles harder. Indeed, some other sites might be slower – not The Dark Platform, fortunately. Now finally Google and duckduckgo stop working, too. 

elkement: I can’t hack without Google.

elkement hacks something without Google though. Managed to ignore File Cloud’s heartbreaking connection attempts.

A few hours later it’s over. File Cloud syncs hacker notes. Red spots on DownDetector start to fade out while the summer sun is setting.

~

FIN, ACK

Where Are the Files? [Winsol – UVR16x2]

Recently somebody has asked me where the log files are stored. This question is more interesting than it seems.

We are using the freely programmable controller UVR16x2 (and its predecessor UVR1611) …

… and their Control and Monitoring Interface – CMI. The CMI is a data logger and runs a web server. It logs data from the controllers (and other devices) via CAN bus – I have demonstrated this in a contrived example recently, and described the whole setup in this older post.

IT / smart home nerds asked me why there are two ‘boxes’ as other solutions only use a ‘single box’ as both controller and logger. I believe separating these functions is safer and more secure: A logger / web server should not be vital to run the controller, and any issues with these auxiliary components must not impact the controller’s core functions.

Log files are stored on the CMI in a proprietary format, and they can be retrieved via HTTP using the software Winsol. Winsol lets you visualize data for one or more days, zoom in, define views, etc. – and data can be exported as CSV files. This is the tool we use for reverse engineering hydraulics and control logic (German blog post about remote hydraulics surgery):

In the latest versions of Winsol, log files are per default stored in the user’s profile on Windows:
C:\Users\[Username]\Documents\Technische Alternative\Winsol

I had never paid much attention to this; I had always changed that path in the configuration to make backup and automation easier. The current question about the log files’ location was actually about how I managed to make different users work with the same log files.

The answer might not be obvious because of the historical location of the log files:

Until some version of Winsol in use in 2017, log files were stored in the Program Files folder – or at least Winsol tried to use that folder. Windows does not allow this anymore for security reasons.

If Winsol is upgraded from an older version, settings might be preserved. I did my tests with Winsol 2.07 upgraded from an earlier version. I am a bit vague about versions as I did not test different upgrade paths in detail. My point is: Users of control system software tend to be conservative when it comes to changing a running system – an older ‘logging PC’ with an older or upgraded version of Winsol is not an unlikely setup.

I started debugging on Windows 10 with the new security feature Controlled Folder Access enabled. CFA, of course, did not know Winsol, considered it an unfriendly app … to be white-listed.

Then I was curious about the default log file folders, and I saw this:

In the Winsol file picker dialogue (to the right) the log folders seem to be in the Program Files folder:
C:\Program Files\Technische Alternative\Winsol\LogX
But in Windows Explorer (to the left) there are no log files at that location.

What does Microsoft Sysinternals Process Monitor say?

There is a Reparse Point, and the file access is redirected to the folder:
C:\Users\[User]\AppData\Local\VirtualStore\Program Files\Technische Alternative\Winsol
Selecting this folder directly in Windows Explorer shows the missing files:
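The redirection can be sketched as a path transformation – a hedged illustration of the observed behavior, not of the actual Windows internals:

```python
from pathlib import PureWindowsPath

def virtualstore_path(program_files_path: str, user_profile: str) -> str:
    """Where UAC file virtualization redirects a legacy write under
    Program Files – a sketch of the observed redirection, not an API."""
    pf = PureWindowsPath(program_files_path)
    # Drop the drive anchor ('C:\') and re-root the remainder under the
    # user's VirtualStore folder.
    relative = pf.relative_to(pf.anchor)
    return str(PureWindowsPath(user_profile, "AppData", "Local",
                               "VirtualStore", relative))
```

So a legacy write to `C:\Program Files\Technische Alternative\Winsol` lands in `C:\Users\[User]\AppData\Local\VirtualStore\Program Files\Technische Alternative\Winsol` – exactly what Process Monitor showed.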

This location can be re-configured in Winsol to allow different users to access the same files (Disclaimer: Perhaps unsupported by the vendor…)

And there are also some truly user-specific configuration files in the user’s profile, in
C:\Users\[User]\AppData\Roaming\Technische Alternative\Winsol

Winsol.xml stores, for example, the list of ‘clients’ (logging profiles) that are included in automated processing of log files, and cookie.txt is the logon cookie for access to the online logging portal provided by Technische Alternative. If you absolutely want to switch Windows users *and* switch logging profiles often *and* sync those, you have to tinker with Winsol.xml, e.g. by editing it with a script (Disclaimer again: Unlikely to be a supported way of doing things ;-))
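Such a scripted edit might look like this – a sketch only, with invented element names; inspect your actual Winsol.xml before scripting against it (and keep a backup):

```python
import xml.etree.ElementTree as ET

# Invented, Winsol.xml-like sample. The real file's element and attribute
# names may differ – this only demonstrates the scripted-edit approach.
SAMPLE = ('<Winsol><DataPath>D:\\OldLogs</DataPath>'
          '<Clients><Client name="house"/></Clients></Winsol>')

def set_data_path(xml_text: str, new_path: str) -> str:
    """Rewrite the (hypothetical) DataPath element and return the new XML."""
    root = ET.fromstring(xml_text)
    node = root.find("DataPath")
    if node is not None:
        node.text = new_path
    return ET.tostring(root, encoding="unicode")
```

For example, `set_data_path(SAMPLE, "\\\\server\\share\\Winsol")` would point the profile at a network share while leaving the client list untouched.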

As a summary, I describe the steps required to migrate Winsol’s configuration to a new PC and prepare it for usage by different users.

  • Install the latest version of Winsol on the target PC.
  • If you use Controlled Folder Access on Windows 10: Exempt Winsol as a friendly app.
  • Copy the contents of C:\Users\[User]\AppData\Roaming\Technische Alternative\Winsol from the user’s profile on the old machine to the new machine (user-specific config files).
  • If the log file folder shows up at a different path on the two machines – for example when using the same folder via a network share – edit the path in Winsol.xml or configure it in General Settings in Winsol.
  • Copy your existing log data to this new path. LogX contains the main log files, Infosol contains clients’ data. The logging configuration for each client, e.g. the IP address or portal name of the logger, is included in the setup.xml file in the root of each client’s folder.
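The copy steps above could be scripted like this – a plain file copy with placeholder paths, nothing Winsol-specific or vendor-supported:

```python
import shutil
from pathlib import Path

def migrate_winsol_config(old_profile_dir: str, new_profile_dir: str) -> list:
    """Copy the user-specific Winsol configuration (e.g. Winsol.xml,
    cookie.txt) from an old profile folder to a new one. Pure file
    copying – any path adjustments inside Winsol.xml still have to be
    done in Winsol's General Settings afterwards."""
    src = Path(old_profile_dir)
    dst = Path(new_profile_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for item in src.iterdir():
        if item.is_file():
            shutil.copy2(item, dst / item.name)
        else:
            shutil.copytree(item, dst / item.name, dirs_exist_ok=True)
    return sorted(p.name for p in dst.iterdir())
```

On the real machines the two arguments would be the `…\AppData\Roaming\Technische Alternative\Winsol` folders of the old and the new profile.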

Note: If you skip some Winsol versions on migrating/upgrading the structure of files might have changed – be careful! Last time that happened by the end of 2016 and Data Kraken had to re-configure some tentacles.

Cloudy Troubleshooting

Actors:

  • Cloud: Service provider delivering an application over the internet.
  • Client: Business using the Cloud.
  • Telco: Service provider operating part of the network infrastructure connecting them.
  • elkement: Somebody who always ends up playing intermediary.

~

Client: Cloud logs us off every so often! We can’t work like this!

elkement: Cloud, what timeouts do you use? Client was only idle for a short break and is logged off.

Cloud: Must be something about your infrastructure – we set the timeout to 1 hour.

Client: It’s becoming worse – Cloud logs us off every few minutes even when we are in the middle of working.

[elkement does a quick test. Yes, it is true.]

elkement: Cloud, what’s going on? Any known issue?

Cloud: No issue on our side. We have thousands of happy clients online. If we had issues, our inboxes would be on fire.

[elkement does more tests. Different computers at Client. Different logon users. Different Client offices. Different speeds of internet connections. Computers at elkement office.]

elkement: It is difficult to reproduce. It seems like it works well for some computers or some locations for some time. But Cloud – we did not have any issues of that kind in the last year. This year the troubles started.

Cloud: The timing of our app is sensitive: If network cards in your computers turn on power saving that might appear as a disconnect to us.

[elkement learns what she never wanted to know about various power saving settings. To no avail.]

Cloud: What about your bandwidth?… Well, that’s really slow. If all people in the office are using that connection we can totally understand why our app sees your users disappearing.

[elkement on a warpath: Tracking down each application eating bandwidth. Learning what she never wanted to know about tuning the background apps, tracking down processes.]

elkement: Cloud, I’ve throttled everything. I am the only person using Clients’ computers late at night, and I still encounter these issues.

Cloud: Upgrade the internet connection! Our protocol might choke on a hardly noticeable outage.

[elkement has to agree. The late-night tests were done over a remote connection, so the measurement may impact the results – as in quantum physics.]

Client: Telco, we buy more internet!

[Telco installs more internet, elkement measures speed. Yeah, fast!]

Client: Nothing has changed, Cloud still kicks us out every few minutes.

elkement: Cloud, I need to badger you again….

Cloud: Check the power saving settings of your firewalls, switches, routers. Again, you are the only one reporting such problems.

[The router is a blackbox operated by Telco]

elkement: Telco, does the router use any power saving features? Could you turn that off?

Telco: No, we don’t use any power saving at all.

[elkement dreams up conspiracy theories: Sometimes performance seems to degrade after business hours. Cloud running backup jobs? Telco’s lines clogged by private users streaming movies? But sometimes it’s working well even in the location with the crappiest internet connection.]

elkement: Telco, we see this weird issue. It’s either Cloud, Client’s infrastructure, or anything in between, e.g. you. Any known issues?

Telco: No, but [proposal of test that would be difficult to do]. Or send us a Wireshark trace.

elkement: … which is what I planned to do anyway…

[elkement on a warpath 2: Sniffing, tracing every process. Turning off all background stuff. Looking at every packet in the trace. Getting to the level where there are no other packets in between the stream of messages between Client’s computers and Cloud’s servers.]

elkement: Cloud, I tracked it down. This is not a timeout. Look at the trace: Server and client communicating nicely, textbook three-way handshake, server says FIN! And no other packet in the way!

Cloud: Try to connect to a specific server of us.

[elkement: Conspiracy theory about load balancers]

elkement: No – erratic as ever. Sometimes we are logged off, sometimes it works with crappy internet. Note that Client could work during vacation last summer with super shaky wireless connections.

[Lots of small changes and tests by elkement and Cloud. No solution yet, but the collaboration is seamless. No politics and no finger-pointing about who is to blame – just work. The thing that keeps you happy as a netadmin / sysadmin in stressful times.]

elkement: Client, there is another interface which has less features. I am going to test it…

[elkement: Conspiracy theory about protocols. More night-time testing].

elkement: Client, Other Interface has the same problems.

[elkement on a warpath 3: Testing again with all possible combinations of computers, clients, locations, internet connections. Suddenly a pattern emerges…]

elkement: I see something!! Cloud, I believe it’s user-dependent. Users X and Y are logged off all the time while A and B aren’t.

[elkement scratches head: Why was this so difficult to see? The tests had just not been that unambiguous until now!]

Cloud: We’ve created a replacement user – please test.

elkement: Yes – New User works reliably all the time! 🙂

Client: It works – we are not thrown off in the middle of work anymore!

Cloud: Seems that something about the user on our servers is broken – never happened before…

elkement: But wait 😦 it’s not totally OK: Now logged off after 15 minutes of inactivity? But never mind – at least not as bad as logged off every 2 minutes in the middle of some work.

Cloud: Yeah, that could happen – an issue with Add-On Product. But only if your app looks idle to our servers!

elkement: But didn’t you tell us that every timeout ever is no less than 1 hour?

Cloud: No – that 1 hour was another timeout …

elkement: Wow – classic misunderstanding! That’s why it was so difficult to spot the pattern. So we had two completely different problems, but both looked like unwanted logoffs after a brief period, and at the beginning both weren’t totally reproducible.

[elkement’s theory validated again: If anything qualifies elkement for such stuff at all it was experience in the applied physics lab – tracking down the impact of temperature, pressure and 1000 other parameters on the electrical properties of superconductors… and trying to tell artifacts from reproducible behavior.]

~

Cloudy

Let Your Hyperlinks Live Forever!

It is the duty of a Webmaster to allocate URIs which you will be able to stand by in 2 years, in 20 years, in 200 years. This needs thought, and organization, and commitment. (https://www.w3.org/Provider/Style/URI)

Joel Spolsky did it:

 I’m bending over backwards not to create “linkrot” — all old links to Joel on Software stories have been replaced with redirects, so they should still work. (November 2001)

More than once:

I owe a huge debt of gratitude to [several people] for weeks of hard work on creating this almost perfect port of 16 years of cruft, preserving over 1000 links with redirects… (December 2016).

Most of the outgoing URLs linked by Joel on Software have rotted, with some notable exceptions: Jakob Nielsen’s URLs do still work, so he lives by what he preached – in 1998:

… linkrot contributes to dissolving the very fabric of the Web: there is a looming danger that the Web will stop being an interconnected universal hypertext and turn into a set of isolated info-islands. Anything that reduces the prevalence and usefulness of cross-site linking is a direct attack on the founding principle of the Web.

No excuses if you are not Spolsky- or Nielsen-famous – I did it too, several times. In 2015 I rewrote the application for my websites from scratch and redirected every single .asp URL to a new friendly URL at a new subdomain.

I am obsessed with keeping old URLs working. I don’t like it if websites are migrated to a new content management system, changing all the URLs.

I checked all that again when migrating to HTTPS last year.

So I am a typical nitpicking dinosaur, waxing nostalgic about the time when web pages were still pages, and when Hyperlinks Subverted Hierarchy. When browsers were not yet running an OS written in Javascript and hogging 70% of your CPU for ad-tracking or crypto-mining.

The dinosaur is grumpy when it has to fix outgoing URLs on this blog. So. Many. Times. Like every second time I test a URL that shows up in my WordPress statistics as clicked, it 404s. Then I try to find equivalent content on the same site – if the domain does still exist and has not been orphaned and hijacked by malvertizers. If I am not successful I link to a version of this content on web.archive.org, track down the content owner’s new site, or find similar content elsewhere.
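The web.archive.org fallback can be automated a little. This sketch only builds the Wayback Machine URL – the `/web/<timestamp>/<url>` path scheme is the public one used by web.archive.org, and the timestamp selects the snapshot closest to it:

```python
def archive_fallback(url: str, timestamp: str = "2018") -> str:
    """Build a Wayback Machine URL to try when a link 404s.
    The timestamp can be a year ('2018') or a full YYYYMMDDhhmmss stamp;
    web.archive.org redirects to the snapshot closest to it."""
    return "https://web.archive.org/web/{}/{}".format(timestamp, url)
```

So a rotted `http://example.com/old-page` becomes `https://web.archive.org/web/2018/http://example.com/old-page` – still no guarantee a snapshot exists, but it is the first thing worth trying.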

My heart breaks when I see that it’s specifically the interesting, unusual content that users want to follow from here – like hard-to-find historical information on how to build a heat pump from clay tablets and straw. My heart breaks even more when the technical content on the target site gets dumbed down more and more with every URL-breaking website overhaul. But OK – you now have this terrific header image with a happy-people-at-work stock photo that covers my whole desktop so that I have to scroll for anything, and the dumbed-down content is shown in boxes that pop up and whirl – totally responsive, though clunky on a desktop computer.

And, yes: I totally know that site owners don’t owe me anything. Just because you hosted that rare and interesting content for the last 10 years does not mean you have to do that forever.

But you marketing ninjas and website wranglers neglected an important point: We live in the age of silly gamification that makes 1990s link building pale: I like yours and you like mine. Buy Followers. Every time I read a puffed up Case Study for a project I was familiar with as an insider, I was laughing for minutes and then checked if it was not satire.

In this era of fake word-of-mouth marketing, genuine incoming links stand out: People say something thoughtful, maybe even nice, about you just because they found your content interesting and worth linking – not because you play silly games of reciprocating. The most valuable links are set by people you don’t know and who did not anticipate you would ever notice their link. As Nassim Taleb says: Virtue is what you do when nobody is looking.

I would go to great lengths not to break links to my sites in those obscure DIY forums whose posts are hardly indexed by search engines. At least I would make a half-hearted attempt at redirecting to a custom 404 page that explains where you might find the moved content. Or just keep the domain name intact. Which of course means not to register a catchy domain name for every product in the first place. Which I consider bad practice anyway – training users to fall for phishing, by getting them used to jumping from one weird but legit domain to another.

And, no, I don’t blame you personally, poor stressed out web admin who had to get the new site up and running before April 1st, because suits in your company said the world would come to an end otherwise. I just think that our internet culture that embraces natural linkrot so easily is as broken as the links.

I tag this as Rant, but it is a Plea: I beg you, I implore you to invest just a tiny part of the time, budget and efforts you allocated to Making the Experience of Your Website Better to making some attempt at keeping your URLs intact. They are actually valuable for others – something you should be proud of.

Reverse Engineering Fun

Recently I read a lot about reverse engineering – in relation to malware research. I for one simply wanted to get ancient and hardly documented HVAC engineering software to work.

The software in question should have shown a photo of the front panel of a device – knobs and displays – augmented with current system’s data, and you could have played with settings to ‘simulate’ the control unit’s behavior.

I tested it on several machines, to rule out some typical issues quickly: Will it run on Windows 7? Will it run on a 32bit system? Do I need to run it as Administrator? None of that helped. I actually saw the application’s user interface coming up once, on the Win 7 32bit test machine I had not started in a while. But I could not reproduce the correct start-up, and in all other attempts on all other machines I just encountered an error message … that used an Asian character set.

I poked around the files and folders the application uses. There were some .xls and .xml files, and most text was in the foreign character set. The Asian error message was a generic Windows dialogue box: You cannot select the text within it directly, but the whole contents of such error messages can be copied using Ctrl+C. Pasting it into Google Translate told me:

Failed to read the XY device data file

Checking the files again, there was indeed an xydevice.xls file, and I wondered if the relative path from exe to xls did not work, or if it was an issue with permissions. The latter was hard to believe, given that I had simply copied the whole bunch of files, my user having the same (full) permissions on all of them.

I started Microsoft Sysinternals Process Monitor to check if the application was groping in vain for the file. It found the file just fine in the right location:

Immediately before accessing the file, the application looped through registry entries for Microsoft JET database drivers for Office files – the last one it probed was msexcl40.dll – a database driver for accessing Excel files.

There is no obvious error in this dump: The xls file was closed before the Windows error popup was brought up; so the application had handled the error somehow.

I had been tinkering a lot myself with database drivers for Excel spreadsheets, Access databases, and even text files – so that looked like a familiar engineering software hack to me 🙂 On start-up the application created a bunch of XML files – I saw them once, right after I saw the GUI once in that non-reproducible test. As far as I could decipher the content in the foreign language, the entries were taken from that problematic xls file which contained a formatted table. It seemed that the application was using a sheet in the xls file as a database table.
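Such a hack typically opens an .xls sheet as a database table via the JET OLEDB provider – msexcl40.dll is the Excel ISAM driver behind it. A sketch of building such a connection string (actually connecting would require the 32-bit JET provider on Windows, e.g. via pyodbc or adodbapi; here I only illustrate the format):

```python
def jet_excel_connection_string(xls_path: str, header_row: bool = True) -> str:
    """Classic JET 4.0 connection string for reading an .xls file as a
    database. 'Excel 8.0' is the ISAM format name for .xls workbooks;
    HDR says whether the first row holds column names."""
    hdr = "Yes" if header_row else "No"
    return (
        "Provider=Microsoft.Jet.OLEDB.4.0;"
        "Data Source={};"
        "Extended Properties='Excel 8.0;HDR={}'"
    ).format(xls_path, hdr)

# A worksheet is then addressed like a table: SELECT * FROM [Sheet1$]
```

Seeing this driver chain in Process Monitor is what tied the application's behavior to the JET stack in the first place.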

What went wrong? I started Windows debugger WinDbg (part of the Debugging Tools for Windows). I tried to go to the next unhandled or handled exception, and I saw again that it stumbled over msexcl40.dll:

But here was finally a complete and googleable error message in nerd speak:

Unexpected error from external database driver (1).

This sounded generic and I was not very optimistic. But this recent Microsoft article was one of the few mentioning the specific error message – an overview of operating system updates and fixes, dated October 2017. It describes exactly the observed issue with using the JET database driver to access an xls file:

Finally my curious observation of the non-reproducible single successful test made sense: When I started the exe on the Win 7 test client, this computer had been started the first time after ~3 months; it was old and slow, and it was just processing Windows Updates – so at the first run the software had worked because the deadly Windows Update had not been applied yet.

Also the ‘2007 timeframe’ mentioned was consistent – as all the application’s executable files were nearly 10 years old. The recommended strategy is to use a more modern version of the database driver, but Microsoft also states they will fix it again in a future version.

So I did not get the software to run, as I obviously cannot fix somebody else’s compiled code – but I could provide the exact information needed by the developer to repair it.

But the key message in this post is that it was simply a lot of fun to track this down 🙂