# Cloudy Troubleshooting (2)

Unrelated to part 1 – but the same genre.

Actors this time:

• File Cloud: A cloud service for syncing and sharing files. We won’t drop a brand name, will we?
• Client: Another user of File Cloud.
• [Redacted]: Once known for reliability and as The Best Network.
• Dark Platform: Wannabe hackers’ playground.
• elkement: Somebody who sometimes just wants to be an end user, but always ends up sniffing and debugging.

There are no dialogues with human life-forms this time, only the elkement’s stream of consciousness, interacting with the others by looking at things on a screen.

elkement: Time for a challenging Sunday hack!

elkement connects to The Dark Platform. Hardly notices anything in the real world anymore. But suddenly elkement looks at the clock – and at File Cloud’s icon next to it.

elkement: File Cloud, what’s going on?? Seems you have a hard time Connecting… for hours now? You have not even synced my hacker notes from yesterday evening?

elkement tries to avoid looking at File Cloud, but it gets too painful.

elkement: OK – let’s consider the File Cloud problem the real Sunday hacker’s challenge…

elkement walks through the imaginary checklist:

• File Cloud mentioned on DownDetector website? No.
• Users tweeting about outage? No.
• Do the other cloudy apps work fine? Yes.
• Do other web sites work fine? Yes.
• Does my router need its regular reboot because its DNS server got stuck? No.
• Should I perhaps try the usual helpdesk recommendation? Yes. (*)

(*) elkement turns router and firewall off and on again. Does not help.

elkement gets worried about Client using File Cloud, too. Connects to Client’s network – via another cloudy app (that obviously also works).

• Does Client have the same issues? Yes and No – Yes at one site, No at another site.

elkement: Oh no – do I have to set up a multi-dimensional test matrix again to check for weird dependencies?

Coffee Break. Leaving the hacker’s cave. Gardening.

elkement: OK, let’s try something new!

elkement connects to super shaky mobile internet via USB tethering on the smart phone.

• Does an alternative internet connection fix File Cloud? Yes!!

elkement: Huh!? Will now again somebody explain to me that a protocol (File Cloud) is particularly sensitive to hardly noticeable network disconnects? Is it maybe really a problem with [Redacted] this time?

elkement checks out DownDetector for [Redacted] – and there they are: the angry users and red spots on the map. They mention that seemingly random websites and applications fail. And that [Redacted] is losing packets.

elkement: Really? Only packets for File Cloud?

elkement starts sniffing. Checks IP addresses.

(elkement: Great, whois does still work, despite the anticipated issues with GDPR!)

elkement spots communication with File Cloud. File Cloud client and server are stuck in a loop of misunderstandings. File Cloud client is rude and says: RST, then starts again. Says Hello. They never shake hands as a previous segment was not captured.

elkement: But why does all the other stuff work??

elkement googles harder. Indeed, some other sites might be slower – not The Dark Platform, fortunately. Now finally Google and duckduckgo stop working, too.

elkement: I can’t hack without Google.

elkement hacks something without Google though. Manages to ignore File Cloud’s heartbreaking connection attempts.

A few hours later it’s over. File Cloud syncs hacker notes. Red spots on DownDetector start to fade out while the summer sun is setting.

~

FIN, ACK

|

# Infinite Loop: Theory and Practice Revisited.

I’ve unlocked a new achievement as a blogger, or a new milestone as a life-form. As a dinosaur telling the same old stories over and over again.

I started drafting a blog post the way I have done for a while now: I do it in my mind only, twisting and turning it for days or weeks – until I am ready to write it down in one go. Today I wanted to release a post called On Learning (2) or the like. I knew I had written an early post with a similar title, so I expected this to be a loosely related update. But then I checked the old On Learning post: I found not only the same general ideas but the same autobiographical anecdotes I wanted to use now – even in the same order.

In 2014 I had looked back on being both a teacher and a student for the greater part of my professional life, and the patterns were always the same – be the field physics, engineering, or IT security. I had written this post after a major update of our software for analyzing measurement data. This update had required me to acquire new skills, which was a delightful learning experience. I tried to reconcile very different learning modes: ‘Book learning’ about so-called theory, including learning for the joy of learning, and solving problems hands-on based on the minimum knowledge absolutely required.

It seems I like to talk about The Joys of Theory a lot – I have meta-posted about theoretical physics in general, more than once, about general relativity as an example, and about computer science. I searched for posts about hands-on learning now – there aren’t any. But every post about my own research and work chronicles this hands-on learning in a non-meta, explicit way. These are the posts listed on the heat pump / engineering page, the IT security / control page, and some of the physics posts about the calculations I used in my own simulations.

Now that I am wallowing in nostalgia and scrolling through my old posts I feel there is one possibly new insight: Whenever I have used knowledge to achieve a result that I really needed to get some job done, I think of this knowledge as emerging from hands-on tinkering and from self-study. I once read that many seasoned software developers said the same in a survey about their background: They checked self-taught despite having university degrees or professional training.

This holds for the things I had learned theoretically – be it in a classroom or via my morning routine of reading textbooks. I learned about differential equations, thermodynamics, numerical methods, heat pumps, and about object-oriented software development. Yet when I actually have to do all that, it is always like re-learning it in a more pragmatic way, even if the ‘class’ was very ‘applied’, not much time had passed since I learned it, and I had taken exams. This is even true for the archetype of all self-studied disciplines – hacking. Doing it – like here – white-hat-style ;-) – is always a self-learning exercise, and reading about pentesting and security happens in an alternate universe.

The difference between these learning modes is maybe not only in ‘the applied’ versus ‘the theoretical’, but it is your personal stake in the outcome that matters – Skin In The Game. A project done by a group of students for the final purpose of getting a grade is not equivalent to running this project for your client or for yourself. The point is not if the student project is done for a real-life client, or if the task as such makes sense in the real world. The difference is whether it feels like an exercise in a gamified system, or whether the result will matter financially / ‘existentially’ as you might try to impress your future client or employer or use the project results to build your own business. The major difference is in weighing risks and rewards, efforts and long-term consequences. Even ‘applied hacking’ in Capture-the-Flag-like contests is different from real-life pentesting. It makes all the difference if you just lose ‘points’ and miss the ‘flag’, or if you inadvertently take down a production system and violate your contract.

So I wonder if the Joy of Theoretical Learning is to some extent due to its risk-free nature. As long as you learn about all those super interesting things just because you want to know – it is innocent play. Only if you finally touch something in the real world and touching things has hard consequences – only then do you know if you are truly ‘interested enough’.

Sorry, but I told you I will post stream-of-consciousness-style now and then :-)

I think it is OK to re-use the image of my beloved pre-1900 physics book I used in the 2014 post:

|

# Where Are the Files? [Winsol – UVR16x2]

Recently somebody asked me where the log files are stored. This question is more interesting than it seems.

We are using the freely programmable controller UVR16x2 (and its predecessor UVR1611) …

… and their Control and Monitoring Interface – CMI:

The CMI is a data logger and runs a web server. It logs data from the controllers (and other devices) via CAN bus – I have demonstrated this in a contrived example recently, and described the whole setup in this older post.

IT / smart home nerds asked me why there are two ‘boxes’ as other solutions only use a ‘single box’ as both controller and logger. I believe separating these functions is safer and more secure: A logger / web server should not be vital to run the controller, and any issues with these auxiliary components must not impact the controller’s core functions.

Log files are stored on the CMI in a proprietary format, and they can be retrieved via HTTP using the software Winsol. Winsol lets you visualize data for 1 or more days, zoom in, define views etc. – and data can be exported as CSV files. This is the tool we use for reverse engineering hydraulics and control logic (German blog post about remote hydraulics surgery):

In the latest versions of Winsol, log files are per default stored in the user’s profile on Windows:

I had never paid much attention to this; I had always changed that path in the configuration to make backup and automation easier. The current question about the log files’ location was actually about how I managed to make different users work with the same log files.

The answer might not be obvious because of the historical location of the log files:

Until some version of Winsol in use in 2017, log files were by default stored in the Program Files folder, or at least Winsol tried to use that folder. Windows does not allow this anymore for security reasons.

If Winsol is upgraded from an older version, settings might be preserved. I did my tests with Winsol 2.07 upgraded from an earlier version. I am a bit vague about versions as I did not test different upgrade paths in detail. My point is that users of control systems’ software tend to be conservative when it comes to changing a running system – an older ‘logging PC’ with an older or upgraded version of Winsol is not an unlikely setup.

I started debugging on Windows 10 with the new security feature Controlled Folder Access enabled. CFA, of course, did not know Winsol, considered it an unfriendly app … to be white-listed.

Then I was curious about the default log file folders, and I saw this:

In the Winsol file picker dialogue (to the right) the log folders seem to be in the Program Files folder:
C:\Program Files\Technische Alternative\Winsol\LogX
But in Windows Explorer (to the left) there are no log files at that location.

What does Microsoft Sysinternals Process Monitor say?

There is a Reparse Point, and the file access is redirected to the folder:
C:\Users\[User]\AppData\Local\VirtualStore\Program Files\Technische Alternative\Winsol
Selecting this folder directly in Windows Explorer shows the missing files:

This location can be re-configured in Winsol to allow different users to access the same files (Disclaimer: Perhaps unsupported by the vendor…)
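The redirection found with Process Monitor follows a simple pattern that can be sketched in a few lines of Python. This is only an illustration of the path mapping seen above, not a statement about how Windows decides when to redirect; the function name `virtual_store_path` and the user name `alice` are hypothetical.

```python
from pathlib import PureWindowsPath


def virtual_store_path(original: str, user: str) -> PureWindowsPath:
    """Map a path on the system drive to its per-user VirtualStore counterpart."""
    path = PureWindowsPath(original)
    # Strip the drive anchor (e.g. "C:\") so that "Program Files\..." remains.
    relative = path.relative_to(path.anchor)
    store_root = PureWindowsPath(
        path.anchor, "Users", user, "AppData", "Local", "VirtualStore"
    )
    return store_root / relative


# "alice" is a placeholder for the [User] part of the path.
redirected = virtual_store_path(
    r"C:\Program Files\Technische Alternative\Winsol\LogX", "alice"
)
print(redirected)
```

Using `PureWindowsPath` keeps the sketch runnable on any operating system, as no file system access is involved.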

And there are also some truly user-specific configuration files in the user’s profile, in
C:\Users\[User]\AppData\Roaming\Technische Alternative\Winsol

Winsol.xml is e.g. for storing the list of ‘clients’ (logging profiles) that are included in automated processing of log files, and cookie.txt is the logon cookie for access to the online logging portal provided by Technische Alternative. If you absolutely want to switch Windows users *and* switch logging profiles often *and* sync those you have to tinker with Winsol.xml, e.g. by editing it using a script (Disclaimer again: Unlikely to be a supported way of doing things ;-))

As a summary, I describe the steps required to migrate Winsol’s configuration to a new PC and prepare it for usage by different users.

• If you use Controlled Folder Access on Windows 10: Exempt Winsol as a friendly app.
• Copy the contents of C:\Users\[User]\AppData\Roaming\Technische Alternative\Winsol from the user’s profile on the old machine to the new machine (user-specific config files).
• If the log file folder shows up at a different path on the two machines – for example when using the same folder via a network share – edit the path in Winsol.xml or configure it in General Settings in Winsol.
• Copy your existing log data to this new path. LogX contains the main log files, Infosol contains clients’ data. The logging configuration for each client, e.g. the IP address or portal name of the logger, is included in the setup.xml file in the root of each client’s folder.
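The copy and path-editing steps above could be scripted roughly as follows. This is a hedged sketch, not a vendor-supported procedure: the helper names are hypothetical, the structure of Winsol.xml is not assumed (the script only replaces a literal path string), and the UTF-8 encoding is an assumption that should be checked against the actual file.

```python
import shutil
from pathlib import Path


def copy_folder(source: Path, target: Path) -> None:
    """Copy a folder tree, merging into the target if it already exists."""
    shutil.copytree(source, target, dirs_exist_ok=True)


def replace_path_in_config(config_file: Path, old_path: str, new_path: str) -> int:
    """Replace every literal occurrence of old_path in a text config file.

    Returns the number of replacements. A .bak copy is written before
    the file is changed. The encoding is an assumption.
    """
    text = config_file.read_text(encoding="utf-8")
    count = text.count(old_path)
    if count:
        backup = config_file.with_name(config_file.name + ".bak")
        shutil.copy2(config_file, backup)
        config_file.write_text(text.replace(old_path, new_path), encoding="utf-8")
    return count


# Example usage with placeholder paths (adapt before running):
# copy_folder(Path(r"\\oldpc\share\Winsol-config"), Path(r"C:\Users\alice\AppData\Roaming\Technische Alternative\Winsol"))
# replace_path_in_config(Path(r"C:\...\Winsol.xml"), r"D:\Logs", r"\\server\Logs")
```

Keeping the backup next to the edited file makes it easy to roll back if Winsol does not accept the changed configuration.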

Note: If you skip some Winsol versions when migrating/upgrading, the structure of files might have changed – be careful! The last time that happened was at the end of 2016, and Data Kraken had to re-configure some tentacles.

|

# Cloudy Troubleshooting

Actors:

• Cloud: Service provider delivering an application over the internet.
• Client: Business using the Cloud
• Telco: Service provider operating part of the network infrastructure connecting them.
• elkement: Somebody who always ends up playing intermediary.

~

Client: Cloud logs us off ever so often! We can’t work like this!

elkement: Cloud, what timeouts do you use? Client was only idle for a short break and is logged off.

Cloud: Must be something about your infrastructure – we set the timeout to 1 hour.

Client: It’s becoming worse – Cloud logs us off every few minutes even when we are in the middle of working.

[elkement does a quick test. Yes, it is true.]

elkement: Cloud, what’s going on? Any known issue?

Cloud: No issue on our side. We have thousands of happy clients online. If we had issues, our inboxes would be on fire.

[elkement does more tests. Different computers at Client. Different logon users. Different Client offices. Different speeds of internet connections. Computers at elkement office.]

elkement: It is difficult to reproduce. It seems like it works well for some computers or some locations for some time. But Cloud – we did not have any issues of that kind in the last year. This year the troubles started.

Cloud: The timing of our app is sensitive: If network cards in your computers turn on power saving that might appear as a disconnect to us.

[elkement learns what she never wanted to know about various power saving settings. To no avail.]

Cloud: What about your bandwidth?… Well, that’s really slow. If all people in the office are using that connection we can totally understand why our app sees your users disappearing.

[elkement on a warpath: Tracking down each application eating bandwidth. Learning what she never wanted to know about tuning the background apps, tracking down processes.]

elkement: Cloud, I’ve throttled everything. I am the only person using Client’s computers late at night, and I still encounter these issues.

Cloud: Upgrade the internet connection! Our protocol might choke on a hardly noticeable outage.

[elkement has to agree. The late-night tests were done over a remote connection; so measurement may impact results, as in quantum physics.]

Client: Telco, we buy more internet!

[Telco installs more internet, elkement measures speed. Yeah, fast!]

Client: Nothing has changed, Cloud still kicks us out every few minutes.

elkement: Cloud, I need to badger you again….

Cloud: Check the power saving settings of your firewalls, switches, routers. Again, you are the only one reporting such problems.

[The router is a blackbox operated by Telco]

elkement: Telco, does the router use any power saving features? Could you turn that off?

Telco: No, we don’t use any power saving at all.

[elkement dreams up conspiracy theories: Sometimes performance seems to degrade after business hours. Cloud running backup jobs? Telco’s lines clogged by private users streaming movies? But sometimes it’s working well even in the location with the crappiest internet connection.]

elkement: Telco, we see this weird issue. It’s either Cloud, Client’s infrastructure, or anything in between, e.g. you. Any known issues?

Telco: No, but [proposal of test that would be difficult to do]. Or send us a Wireshark trace.

elkement: … which is what I planned to do anyway…

[elkement on a warpath 2: Sniffing, tracing every process. Turning off all background stuff. Looking at every packet in the trace. Getting to the level where there are no other packets in between the stream of messages between Client’s computers and Cloud’s servers.]

elkement: Cloud, I tracked it down. This is not a timeout. Look at the trace: Server and client communicating nicely, textbook three-way handshake, server says FIN! And no other packet in the way!

Cloud: Try to connect to a specific server of ours.

elkement: No – erratic as ever. Sometimes we are logged off, sometimes it works with crappy internet. Note that Client could work during vacation last summer with super shaky wireless connections.

[Lots of small changes and tests by elkement and Cloud. No solution yet, but the collaboration is seamless. No politics and finger-pointing who to blame – just work. The thing that keeps you happy as a netadmin / sysadmin in stressful times.]

elkement: Client, there is another interface which has fewer features. I am going to test it…

[elkement: Conspiracy theory about protocols. More night-time testing].

elkement: Client, Other Interface has the same problems.

[elkement on a warpath 3: Testing again with all possible combinations of computers, clients, locations, internet connections. Suddenly a pattern emerges…]

elkement: I see something!! Cloud, I believe it’s user-dependent. Users X and Y are logged off all the time while A and B aren’t.

[elkement scratches head: Why was this so difficult to see? Tests were not that unambiguous until now!]

Cloud: We’ve created a replacement user – please test.

elkement: Yes – New User works reliably all the time! :-)

Client: It works –  we are not thrown off in the middle of work anymore!

Cloud: Seems that something about the user on our servers is broken – never happened before…

elkement: But wait :-( it’s not totally OK: Now logged off after 15 minutes of inactivity? But never mind – at least not as bad as logged off every 2 minutes in the middle of some work.

Cloud: Yeah, that could happen – an issue with Add-On Product. But only if your app looks idle to our servers!

elkement: But didn’t you tell us that every timeout ever is no less than 1 hour?

Cloud: No – that 1 hour was another timeout …

elkement: Wow – classic misunderstanding! That’s why it was so difficult to spot the pattern. So we had two completely different problems, but both looked like unwanted logoffs after a brief period, and at the beginning both weren’t totally reproducible.

[elkement’s theory validated again: If anything qualifies elkement for such stuff at all it was experience in the applied physics lab – tracking down the impact of temperature, pressure and 1000 other parameters on the electrical properties of superconductors… and trying to tell artifacts from reproducible behavior.]

~

|

# Logging Fun with UVR16x2: Photovoltaic Generator – Modbus – CAN Bus

The Data Kraken wants to grow new tentacles.

I am playing with the CMI – Control and Monitoring Interface – the logger / ‘ethernet gateway’ connected to our control units (UVR1611, UVR16x2) via CAN bus. The CMI has become a little Data Kraken itself: Inputs and outputs can be created for CAN bus and Modbus, and data from most CAN devices can also be logged via JSON.

Are these features useful to integrate the datalogger of our photovoltaics inverter – Fronius Symo 4.5-3-M? I am now logging data to a USB stick and feeding the CSV files to the SQL Server Data Kraken. The USB logger’s logging interval is 5 minutes whereas Modbus TCP allows for logging every few seconds. The inverter has built-in energy management features, but it can only ‘signal’ via a relay which also requires proper wiring. Modbus TCP, on the other hand, could use the existing WLAN connection of the inverter and the control unit could do something smarter with the sensor reading of the output power.

My motivation is to test if you – as a UVR16x2 user – can re-use a logger you already have – the CMI – as much as possible, avoiding the need to run another ‘logging server’ all the time (Also my SQL Server is for analysis, not for real-time logging). I know that there are many open source Modbus clients available and that it is easy to write a Python script.

Activate Modbus on the inverter: I prefer floating numbers to integers plus a scaling factor, and I turn off the option to make changes via Modbus:

Modbus settings, Fronius Datalogger, inverter’s local web server. 502 is the TCP standard port. The alternative to floating numbers is integers plus a varying scaling factor (SF), to be found in another register.

Check Fronius documentation of its Modbus registers: The document is currently available here (Link last checked and updated 2019-01). There are different sets of registers related to the inverter or associated with one string of PV panels:

PDF p.30, Common Inverter Model. For logging AC output power you need: Address 40092, function code 3 (read holding registers), datatype float32 (‘corresponds’ to two 16bit integer registers, thus size 2).

The address to be configured on a Modbus client is smaller than this address by 1 – so 40091 needs to be set to log AC power.
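For readers who prefer a script to the CMI, the request for this value can be sketched in plain Python without any Modbus library. The frame layout follows the Modbus TCP specification (MBAP header plus a read-holding-registers request); the unit ID 1 and the transaction ID are assumptions for illustration, and the helper names are hypothetical.

```python
import struct

READ_HOLDING_REGISTERS = 3  # Modbus function code


def client_address(documented_register: int) -> int:
    """Address to configure at the client: documented number minus 1."""
    return documented_register - 1


def build_read_request(transaction_id: int, unit_id: int,
                       address: int, count: int) -> bytes:
    """Build a Modbus TCP request frame for reading holding registers.

    MBAP header: transaction id, protocol id (always 0), length of the
    bytes that follow (unit id + 5 bytes of PDU = 6), unit id.
    PDU: function code, start address, number of registers.
    """
    return struct.pack(">HHHBBHH", transaction_id, 0, 6, unit_id,
                       READ_HOLDING_REGISTERS, address, count)


# AC power: documented register 40092, float32 = 2 registers.
# Unit ID 1 is an assumption; check the inverter's Modbus settings.
frame = build_read_request(1, 1, client_address(40092), 2)
print(frame.hex(" "))  # 00 01 00 00 00 06 01 03 9c 9b 00 02
```

Sending this frame to port 502 of the inverter over a TCP socket should return the two registers, but that part is left out here as it needs the real device.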

Using these configuration parameters an analog Modbus Input is added at the CMI. The signal is ‘digital’ – but in field-bus-language everything that is not a single bit – 0/1 – seems to be ‘analog’.

Modbus input at the CMI. Input value:  32bits read from the bus interpreted as an integer. Actual value: Integer part of the ‘true value’ = the 32bits interpreted as 32-bit float.

Yes, I checked the network trace ;-) as the byte order dropdown menu confused me: According to the Modbus protocol specification Big Endian is required, not an option.
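The difference between the ‘input value’ and the ‘actual value’ can be reproduced with Python’s `struct` module. This is a minimal sketch, assuming the high-order register is transmitted first; the example registers encode the made-up reading 1234.5 W, and the function names are hypothetical.

```python
import struct


def registers_to_uint32(high_word: int, low_word: int) -> int:
    """Interpret two 16-bit registers as one unsigned 32-bit integer."""
    return (high_word << 16) | low_word


def registers_to_float32(high_word: int, low_word: int) -> float:
    """Interpret the same 32 bits as a big-endian IEEE 754 float."""
    return struct.unpack(">f", struct.pack(">HH", high_word, low_word))[0]


# Made-up example: the two registers that encode 1234.5 as float32.
high, low = 0x449A, 0x5000

input_value = registers_to_uint32(high, low)    # the 32 bits read as an integer
true_value = registers_to_float32(high, low)    # the 32 bits read as a float
actual_value = int(true_value)                  # integer part only

print(input_value, true_value, actual_value)
```

The raw integer looks nothing like the power reading, which is why the interpretation as a float has to be selected explicitly.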

Factors and data types: Only integer values are understood by CAN devices. Decimal places might be indicated by a scaling factor. The PV power value in Watt has enough significant digits; so the integer part of the float number is fine. But for current in Ampere – typically about 15A maximum – a Factor of 10 would be better. It would not have helped to select int + scaling factor at the inverter: The scaling factor would be stored in a second register, there is a different factor for every parameter, and you cannot configure another ‘scaling factor register’ per input at the CMI. Theoretically you could log the scaling factor separately and re-scale the value in a custom application – but then I would use a separate, custom logger.
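The two scaling approaches discussed here can be compared in a short sketch. The function names are hypothetical, and the current reading of 15.37 A is a made-up example; the ‘int + SF’ convention is shown as value times ten to the power of the scaling factor.

```python
def to_can_integer(value: float, factor: int = 1) -> int:
    """Multiply by a fixed factor and keep the integer part only."""
    return int(value * factor)


def from_can_integer(raw: int, factor: int = 1) -> float:
    """Undo the fixed factor at the receiving side."""
    return raw / factor


def apply_scaling_factor(raw: int, sf: int) -> float:
    """'int + SF' style: the scaling factor is a power of ten from a second register."""
    return raw * 10 ** sf


# Power in Watt: the integer part is precise enough.
print(to_can_integer(4321.7))             # 4321

# Current in Ampere: a factor of 10 keeps one decimal place.
raw_current = to_can_integer(15.37, 10)
print(raw_current, from_can_integer(raw_current, 10))

# Made-up 'int + SF' reading: raw value 1537 with scaling factor -2.
print(apply_scaling_factor(1537, -2))
```

With a fixed factor the receiver only needs one constant per input, whereas the ‘int + SF’ variant needs a second register per value.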

In any case, if you screw this up, you see non-sensible numbers on the CAN bus: Slowly evolving positive values – like PV power on a sunny day – are displayed as wild variations of signed integers between -32000 and 32000 ;-)
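One way such numbers can arise is sketched below: if only one 16-bit half of a float32 is taken and read as a signed integer, a smoothly rising value turns into jumps across the whole signed 16-bit range. This is an illustration of the effect under that assumption, not a diagnosis of what the CMI does internally; the function name is hypothetical.

```python
import struct


def low_word_as_int16(value: float) -> int:
    """Encode value as big-endian float32 and read its low 16 bits as signed int."""
    raw = struct.pack(">f", value)
    return struct.unpack(">h", raw[2:])[0]


# A slowly rising 'PV power' in Watt (made-up values).
for watts in (1234.5, 1235.0, 1236.0, 1237.0):
    print(watts, low_word_as_int16(watts))
```

A change of a single Watt is enough to flip the sign of the misread value.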

Where are the ‘logged’ data? The CMI is first and foremost the data logger for the control units. The CMI does not immediately store the data from Modbus inputs in a  local ‘logging database’. All I have achieved so far is to display the value on the Settings page. The CMI can only log values from the CAN bus or DL bus. So we need an…

… Analog CAN Output at the CMI:

The CMI has the default node number 56 on the CAN bus. Other CAN devices on the bus can query it for this parameter by specifying node 56 and output no 1.

These are the devices on our CAN bus:

CAN bus displayed on the CMI’s website. UVR1611 and UVR16x2 controllers can be managed by clicking their icons – which brings up a web page that resembles the controller’s local display.

The CMI’s Logging page looks tempting – can we simply select the CMI itself as a CAN logging source – CAN 56?

Configuration of the devices the CMI logs data from, via CAN bus. CAN 1 – UVR1611, CAN 2 – UVR16x2, CAN 41 – energy meter CAN-EZ.

Nothing stops you from selecting CAN 56 in this dropdown menu, but it does not end well:

CAN error message displayed at the logger CMI when you try to configure the CMI also as a logging source.

We need a round-trip: Data needs to be sent to a supported device first – one of the controllers on the CAN bus. We need an…

Analog CAN input / network variable at the UVR16x2:

Configuration of a CAN input at the controller UVR16x2 (via CMI’s web interface to the controller).

The value of AC power is displayed as integer without scaling. Had a factor of 10 been used at the Modbus input it would be ‘corrected’ here, using the Unit called dimensionless,1.

Values received by the controller UVR16x2 over CAN bus.

Result of all this: UVR16x2 knows PV power and can use it to do magic smart things when controlling the heat pump. On the other hand, CMI can log this value – in the same way it logs all other sensor readings (after an update of the logging settings in the controller’s functional data, using TAPPS).

Log files are retrieved by Winsol, the free logging software for the CMI …

Logged data visualized with Winsol. Log files are downloaded from the CMI on the internal LAN or via Technische Alternative’s web portal. PV power (PV.Leistung.Watt) is displayed together with global radiation on a vertical plane (GBS, at the solar/air collector for the heat pump), ambient temperature (red), temperature of solar/air collector (orange)

… or logging is configured at the web portal cmi.ta.co.at …

Configuration of logging at cmi.ta.co.at: Supported loggers are UVR1611 and UVR16x2. Values to be logged are selected from all direct inputs / outputs / functions and from CAN network inputs and outputs.

… and data can be viewed online:

Data visualized at cmi.ta.co.at. Data logged via CAN are sent from the CMI to the web portal.

Using this kind of logging for all values the inverter provides would be costly: It’s not just a column you add to a log file, but you occupy one of the limited inputs and outputs at the CMI and the controller. If you really need to know the voltage between phase 1 and 2 or apparent power you had better stick with the USB file or use a separate Modbus logger like a Raspberry Pi. This project is great and documented very well – data acquisition from a Symo inverter using Python plus a web front end.

Sending Modbus data back and forth from the CMI to UVR controllers is only worth the effort if you need them for control, not for ‘nice-to-have’ logging.

|

# Things You Find in Your Hydraulic Schematic

Building an ice storage powered heat pump system is a DIY adventure – for a Leonardo da Vinci of plumbing, electrical engineering, carpentry, masonry, and computer technology.

But that holistic approach is already demonstrated clearly in our hydraulic schematics. Actually, here it is even more daring and bold:

There is Plutonium – Pu – everywhere in the heating circuit and the brine circuit …

I can’t tell if this is a hazard or if it boosts energy harvest. But I was not surprised – given that Doc Emmett Brown is our hero.

Maybe we see the impact of contamination already: How should I explain the mutated butterflies with three wings otherwise? After all, they are even tagged with M

Our default backup heating system is … Facebook Messenger!

So the big internet companies are already delivering heating-as-a-service-from-the-cloud!

But what the hell is the tennis ball needed for?

|

# Cooling Potential

I had an interesting discussion about the cooling potential of our heat pump system – in a climate warmer than ours.

Recently I’ve shown data for the past heating season, including also passive cooling performance:

After the heating season, tank temperature is limited to 10°C as long as possible – the collector is bypassed in the brine circuit (‘switched off’). But with the beginning of May, the tank temperature starts to rise nevertheless, as the tank is heated by the surrounding ground.

Daily cooling energy hardly exceeds 20kWh, so the average cooling power is always well below 1kW. This is much lower than the design peak cooling load – the power you would need to cool the rooms to 20°C at noon on a hot summer day (rather ~10kW for our house.)

The blue spikes are single dots for a few days, and they make the curve look more impressive than it really is: We could use about 600kWh of cooling energy – compared to about 15.000kWh for space heating. (Note that I am from Europe – I use decimal commas and thousands dots :-))

There are three ways of ‘harvesting cold’ with this system:

(1) When water in the hygienic storage tank (for domestic hot water) is heated up in summer, the heat pump extracts heat from the underground tank.

Per summer month the heat pump needs about 170kWh of input ambient energy from the cold tank – for producing an output heating energy of about 7kWh per day – 0,3kW on average for two persons, just in line with ‘standards’. This means that nearly all the passive cooling energy we used was ‘produced’ by heating hot water.
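The numbers in this paragraph can be checked with a few lines of arithmetic. This is a rough sketch assuming a 30-day month and neglecting all losses, so the electrical energy and the resulting COP are back-of-the-envelope estimates derived here, not measured values.

```python
def average_power_kw(energy_kwh_per_day: float) -> float:
    """Average power over 24 hours for a given daily energy."""
    return energy_kwh_per_day / 24.0


DAYS_PER_MONTH = 30                 # assumption
heat_output_per_day_kwh = 7.0       # hot water heating energy per day
tank_energy_per_month_kwh = 170.0   # ambient energy taken from the cold tank

# 7 kWh per day corresponds to about 0.3 kW on average.
print(round(average_power_kw(heat_output_per_day_kwh), 2))

# Energy balance per month, neglecting losses:
# heating energy = energy from the tank + electrical energy.
heat_output_per_month_kwh = heat_output_per_day_kwh * DAYS_PER_MONTH
electrical_per_month_kwh = heat_output_per_month_kwh - tank_energy_per_month_kwh
implied_cop = heat_output_per_month_kwh / electrical_per_month_kwh
print(heat_output_per_month_kwh, electrical_per_month_kwh, implied_cop)
```

The same helper shows why 20kWh of daily cooling energy is ‘well below 1kW’ on average: `average_power_kw(20.0)` is about 0.83.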

You can see the effect on the cooling power available during a hot day here (from this article on passive cooling in the hot summer of 2015)

Blue arrows indicate hot water heating time slots – for half an hour a cooling power of about 4kW was available. But for keeping the room temperature at somewhat bearable levels, it was crucial to cool ‘low-tech style’ – by opening the windows during the night (Vent)

(2) If nights in late spring and early summer are still cool, the underground tank can be cooled via the collector during the night.

In the last season we gained about 170kWh in total in that way – only as much as by one month of hot water heating. The effect also depends on control details: If you start cooling early in the season when you ‘actually do not really need it’ you can harvest more cold because of the higher temperature difference between tank and cold air.

(3) You keep the cold or ice you ‘create’ during the heating season.

The set point tank temperature for summer  is a trade-off between saving as much cooling energy as possible and keeping the Coefficient of Performance (COP) reasonably high also in summer – when the heat sink temperature is 50°C because the heat pump only heats hot tap water.

20°C is the maximum heat source temperature allowed by the heat pump vendor. The temperature difference between 20°C and the set point of 10°C translates to about 300kWh (only) for 25m³ of water. But cold is also transferred to ground and thus the effective store of cold is larger than the tank itself.
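The figure of about 300kWh follows from the heat capacity of water. A minimal sketch, using textbook values for density and specific heat (the constant and function names are chosen here for illustration):

```python
WATER_DENSITY_KG_PER_M3 = 1000.0
WATER_SPECIFIC_HEAT_KJ_PER_KG_K = 4.19
KJ_PER_KWH = 3600.0


def sensible_heat_kwh(volume_m3: float, delta_t_k: float) -> float:
    """Energy needed to change the temperature of a water volume by delta_t_k."""
    mass_kg = volume_m3 * WATER_DENSITY_KG_PER_M3
    return mass_kg * WATER_SPECIFIC_HEAT_KJ_PER_KG_K * delta_t_k / KJ_PER_KWH


# 25 m³ of water, warming from the 10°C set point to the 20°C limit.
print(round(sensible_heat_kwh(25.0, 20.0 - 10.0)))  # about 291 kWh
```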

What are the options to increase this seasonal storage of cold?

• Turning the collector off earlier. To store as much ice as possible, the collector could even be turned off while still in space heating mode – as we did during the Ice Storage Challenge 2015.
• Active cooling: The store of passive cooling energy is limited – our large tank only contains about 2.000kWh even if frozen completely. If more cooling energy is required, there has to be a cooling backup. Some brine/water heat pumps[#] have a 4-way-valve built into the refrigeration cycle, and the roles of evaporator and condenser can be reversed: The room is cooled and the tank is heated up. In contrast to passive cooling the luke-warm tank and the surrounding ground are useful. The cooling COP would be fantastic because of the low temperature difference between source and sink – it might actually be so high that you need special hydraulic precautions to limit it.
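The order of magnitude of the energy stored in a frozen tank can be estimated from the latent heat of fusion of water. This sketch assumes that the full 25m³ mentioned earlier freezes, so it is an upper bound; the figure of about 2.000kWh quoted above is somewhat lower and of the same order. The names are chosen here for illustration.

```python
WATER_DENSITY_KG_PER_M3 = 1000.0
LATENT_HEAT_OF_FUSION_KJ_PER_KG = 334.0
KJ_PER_KWH = 3600.0


def latent_heat_kwh(water_volume_m3: float, frozen_fraction: float = 1.0) -> float:
    """Energy released or absorbed when a fraction of the water freezes or melts."""
    mass_kg = water_volume_m3 * WATER_DENSITY_KG_PER_M3 * frozen_fraction
    return mass_kg * LATENT_HEAT_OF_FUSION_KJ_PER_KG / KJ_PER_KWH


print(round(latent_heat_kwh(25.0)))        # upper bound, everything frozen
print(round(latent_heat_kwh(25.0, 0.85)))  # hypothetical: 85% frozen
```

The frozen fraction of 85% is a made-up value that only shows how the estimate scales.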

The earlier / the more often the collector is turned off to create ice for passive cooling, the worse the heating COP will be. On the other hand, the more cold you save, the more economical cooling is later:

1. Because the active cooling COP (or EER[*]) will be higher and
2. Because the total cooling COP summed over both cooling phases will be higher as no electrical input energy is needed for passive cooling – only circulation pumps.

([*] The COP is the ratio of output heating energy and electrical energy, and the EER – energy efficiency ratio – is the ratio of output cooling energy and electrical energy. Using kWh as the unit for all energies and assuming condenser and evaporator are completely ‘symmetrical’, the EER of a heat pump used ‘in reverse’ is its heating COP minus 1.)
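The ‘minus 1’ follows from the energy balance: Output heat is the sum of the heat extracted at the evaporator and the electrical energy, so cooling energy divided by electrical energy is the COP minus 1. The sketch below also adds up both cooling phases. The energies and the COP used are hypothetical illustration values, not measurements, and `combined_cooling_ratio` is a helper name I made up.

```python
def eer_from_heating_cop(heating_cop):
    """EER of an idealized 'symmetrical' heat pump run in reverse."""
    return heating_cop - 1.0


def combined_cooling_ratio(passive_kwh, active_kwh, active_eer, pump_kwh):
    """Total cooling energy per electrical energy over both cooling phases.

    Passive cooling only costs the circulation pumps' energy (pump_kwh);
    active cooling additionally needs active_kwh / active_eer.
    """
    electrical_kwh = active_kwh / active_eer + pump_kwh
    return (passive_kwh + active_kwh) / electrical_kwh


# Hypothetical numbers for illustration only
active_eer = eer_from_heating_cop(5.0)
total_ratio = combined_cooling_ratio(
    passive_kwh=600.0, active_kwh=400.0, active_eer=active_eer, pump_kwh=50.0
)
print(active_eer, round(total_ratio, 2))
```

With these assumed numbers the ratio over both phases is higher than the EER of active cooling alone, because the passive share only costs pump energy.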

So there would be four distinct ways / phases of running the system in a season:

1. Standard heating using collector and tank. In a warmer climate, the tank might not even be frozen yet.
2. Making ice: At the end of the heating season the collector might be turned off to build up ice for passive cooling. In case of an ‘emergency’ / unexpected cold spell of weather, the collector could be turned on intermittently.
3. Passive cooling: After the end of the heating season, the underground tank cools the buffer tank (via its internal heat exchanger spirals containing cool brine) which in turn cools the heating floor loops turned ‘cooling loops’.
4. When passive cooling power is not sufficient anymore, active cooling could be turned on. The bulk volume of the buffer tank is cooled now directly with the heat pump, and waste heat is deposited in the underground tank and ground. This will also boost the underground heat sink just right to serve as the heat source again in the upcoming heating season.

In both cooling phases the collector could be turned on in colder nights to cool the tank. This will work much better in the active cooling phase – when the tank is likely to be warmer than the air in the night. Actually, night-time cooling might be the main function the collector would have in a warmer climate.

___________________________________

[#] That seems to be valid mainly/only for domestic brine-water heat pumps from North American or Chinese vendors; they offer the reversing valve as a common option. European vendors rather offer a so-called Active Cooling box, which is a cabinet that can be nearly as big as the heat pump itself. It contains a bunch of valves and heat exchangers that allow for ‘externally’ swapping the connections of condenser and evaporator to heat sink and source respectively.

# Reverse Engineering Fun

Recently I read a lot about reverse engineering – in relation to malware research. I for one simply wanted to get ancient and hardly documented HVAC engineering software to work.

The software in question should have shown a photo of the front panel of a device – knobs and displays – augmented with current system’s data, and you could have played with settings to ‘simulate’ the control unit’s behavior.

I tested it on several machines, to rule out some typical issues quickly: Will it run on Windows 7? Will it run on a 32bit system? Do I need to run it as Administrator? None of that helped. I actually saw the application’s user interface coming up once, on the Win 7 32bit test machine I had not started in a while. But I could not reproduce the correct start-up, and in all other attempts on all other machines I just encountered an error message … that used an Asian character set.

I poked around the files and folders the application uses. There were some .xls and .xml files, and most text was in the foreign character set. The Asian error message was a generic Windows dialogue box: You cannot select the text within it directly, but the whole contents of such error messages can be copied using Ctrl+C. Pasting it into Google Translate, I got:

Failed to read the XY device data file

Checking the files again, there was an xydevice.xls file, and I wondered if the relative path from exe to xls did not work, or if it was an issue with permissions. The latter was hard to believe, given that I simply copied the whole bunch of files, my user having the same (full) permissions on all of them.

I started Microsoft Sysinternals Process Monitor to check if the application was groping in vain for the file. It found the file just fine in the right location:

Immediately before accessing the file, the application looped through registry entries for Microsoft JET database drivers for Office files – the last one it probed was msexcl40.dll – a database driver for accessing Excel files.

There is no obvious error in this dump: The xls file was closed before the Windows error popup was brought up; so the application had handled the error somehow.

I had been tinkering a lot myself with database drivers for Excel spreadsheets, Access databases, and even text files – so that looked like a familiar engineering software hack to me :-) On start-up the application created a bunch of XML files – I saw them once, right after I saw the GUI once in that non-reproducible test. As far as I could decipher the content in the foreign language, the entries were taken from that problematic xls file which contained a formatted table. It seemed that the application was using a sheet in the xls file as a database table.

What went wrong? I started Windows debugger WinDbg (part of the Debugging Tools for Windows). I tried to go to the next unhandled or handled exception, and I saw again that it stumbled over msexcl40.dll:

But here was finally a complete and googleable error message in nerd speak:

Unexpected error from external database driver (1).

This sounded generic and I was not very optimistic. But this recent Microsoft article was one of the few mentioning the specific error message – an overview of operating system updates and fixes, dated October 2017. It describes exactly the observed issue with using the JET database driver to access an xls file:

Finally my curious observation of the non-reproducible single successful test made sense: When I started the exe on the Win 7 test client, this computer had been started the first time after ~3 months; it was old and slow, and it was just processing Windows Updates – so at the first run the software had worked because the deadly Windows Update had not been applied yet.

Also the ‘2007 timeframe’ mentioned was consistent – as all the application’s executable files were nearly 10 years old. The recommended strategy is to use a more modern version of the database driver, but Microsoft also states they will fix it again in a future version.

So I did not get the software to run, as I obviously cannot fix somebody else’s compiled code – but I could provide the exact information needed by the developer to repair it.

But the key message in this post is that it was simply a lot of fun to track this down :-)

|

Recently I presented the usual update of our system’s and measurement data documentation. The PDF document contains consolidated numbers for each year and month of operations:

Total output heating energy (incl. hot tap water), electrical input energy (incl. brine pump) and their ratio – the performance factor. Seasons always start on Sept. 1, except the first season, which started in Nov. 2011. For ‘special experiments’ that had an impact on the results see the text and the PDF linked above.

It is finally time to tackle the fundamental questions:

What is the impact of the size of the solar/air collector?

or

What is the typical output power of the collector?

In 2014 the Chief Engineer had rebuilt the collector so that you can toggle between 12m2 and 24m2.

TOP: Full collector – hydraulics as in seasons 2012, 2013. Active again since Sept. 2017. BOTTOM: Half of the collector, used in seasons 2014, 15, and 16.

Do we have data for seasons we can compare in a reasonable way – seasons that (mainly) differ by collector area?

We disregard seasons 2014 and 2016 – we had to get rid of a nearly 100-year-old roof truss and only heated the ground floor with the heat pump.

Attic rebuild project – point of maximum destruction – generation of fuel for the wood stove.

Season 2014 was atypical anyway because of the Ice Storage Challenge experiment.

Then seasonal heating energy should be comparable – so we don’t consider the cold seasons 2012 and 2016.

Remaining warm seasons: 2013 – where the full collector was used – and 2015 (half collector). The whole house was heated with the heat pump; heating energies and ambient energies were similar – and performance factors were basically identical. So we checked the numbers for the ice months Dec/Jan/Feb. Here a difference can be spotted, but it is far less dramatic than expected. For half the collector:

• Collector harvest is about 10% lower
• Performance factor is lower by about 0,2
• Brine inlet temperature for the heat pump is about 1,5K lower

The upper half of the collector is used, as indicated by hoarfrost.

It was counter-intuitive, and I scrutinized Data Kraken to check it for bugs.

But actually we forgot that we had predicted that years ago: Simulations show the trend correctly, and it suffices to do some basic theoretical calculations. You only need to know how to represent a heat exchanger’s power in two different ways:

Power is either determined by the temperature of the fluid when it enters and exits the exchanger tubes …

[1]   (T_brine_outlet – T_brine_inlet) * flow_rate * specific_heat

… but power can also be calculated from the heat energy flow from brine to air – over the surface area of the tubes:

[2]   delta_T_brine_air * Exchange_area * some_coefficient

Delta T is an average over the whole exchanger length (actually a logarithmic average, but using an arithmetic average is good enough for typical parameters). Some_coefficient is a parameter that characterizes heat transfer per area or per length of a tube, so Exchange_area * some_coefficient could also be called the total heat transfer coefficient.
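How close the arithmetic average is to the logarithmic one can be checked quickly. The temperatures below are hypothetical but in the typical range for brine and air: air at 5°C, brine entering the collector at 0°C and leaving it at 2°C.

```python
import math


def arithmetic_mean_delta_t(delta_at_inlet, delta_at_outlet):
    """Arithmetic mean of the temperature differences at both tube ends."""
    return 0.5 * (delta_at_inlet + delta_at_outlet)


def log_mean_delta_t(delta_at_inlet, delta_at_outlet):
    """Logarithmic mean temperature difference; both inputs must be > 0."""
    if math.isclose(delta_at_inlet, delta_at_outlet):
        return delta_at_inlet
    return (delta_at_inlet - delta_at_outlet) / math.log(
        delta_at_inlet / delta_at_outlet
    )


# Hypothetical values: air 5°C, brine 0°C at the inlet and 2°C at the outlet
t_air, t_brine_inlet, t_brine_outlet = 5.0, 0.0, 2.0
mean_arithmetic = arithmetic_mean_delta_t(t_air - t_brine_inlet, t_air - t_brine_outlet)
mean_logarithmic = log_mean_delta_t(t_air - t_brine_inlet, t_air - t_brine_outlet)
relative_error = mean_arithmetic / mean_logarithmic - 1.0
print(f"{mean_arithmetic:.2f} K vs. {mean_logarithmic:.2f} K")
```

For these assumed temperatures the two averages differ by only a few percent. The difference grows when the brine temperature changes a lot along the tubes compared to its distance from the air temperature.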

If several heat exchangers are connected in series their powers are not independent as they share common temperatures of the fluid at the intersection points:

The brine circuit connecting heat pump, collector and the underground water/ice storage tank. The three ‘interesting’ temperatures before/after the heat pump, collector and tank can be calculated from the current power of the heat pump, ambient air temperature, and tank temperature.

When the heat pump is off in ‘collector regeneration mode’ the collector and the heat exchanger in the tank necessarily transfer heat at the same power per equation [1] – as one’s brine inlet temperature is the other one’s outlet temperature, the flow rate is the same, and also specific heat (whose temperature dependence can be ignored).

But powers can also be expressed by [2]: Each exchanger has a different area, a different heat transfer coefficient, and different mean temperature difference to the ambient medium.

So there are three equations…

• Power for each exchanger as defined by [1]
• 2 equations of type [2], one with specific parameters for collector and air, the other for the heat exchanger in the tank.

… and from those the three unknowns can be calculated: brine inlet temperature, brine outlet temperature, and harvesting power. All is simple and linear, so it is not a big surprise that collector harvesting power is proportional to the temperature difference between air and tank. The warmer the air, the more you harvest.

The combination of coefficient factors is the ratio of the product of total coefficients and their sum, like: $\frac{f_1 * f_2}{f_1 + f_2}$ – the inverse of the sum of inverses.
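The three equations can be solved in a few lines when the arithmetic average is used for the mean brine temperature. This is a sketch under that assumption, not the code of our simulation; all parameter values are hypothetical, and `ua_collector`, `ua_tank` and `capacity_rate` are names I introduce for the two total heat transfer coefficients and for flow_rate * specific_heat.

```python
def regeneration_state(t_air, t_tank, ua_collector, ua_tank, capacity_rate):
    """Solve the linear system for regeneration mode (heat pump off).

    ua_collector, ua_tank: exchange area * coefficient, in W/K
    capacity_rate: flow_rate * specific_heat of the brine, in W/K
    Returns (power in W, collector brine inlet temp, collector brine outlet temp).
    """
    # Combined coefficient: the inverse of the sum of inverses
    ua_combined = ua_collector * ua_tank / (ua_collector + ua_tank)
    power = ua_combined * (t_air - t_tank)
    # Type [2] for the collector gives the mean brine temperature
    t_brine_mean = t_air - power / ua_collector
    # Type [1] gives the spread between outlet and inlet
    half_spread = power / (2.0 * capacity_rate)
    return power, t_brine_mean - half_spread, t_brine_mean + half_spread


# Hypothetical parameters: air 10°C, tank 0°C
power, t_in, t_out = regeneration_state(
    t_air=10.0, t_tank=0.0, ua_collector=500.0, ua_tank=500.0, capacity_rate=1000.0
)
print(power, t_in, t_out)
```

In this simplified model the flow rate does not change the harvesting power; it only determines how far the brine inlet and outlet temperatures are apart.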

This formula shows what you might have guessed intuitively: If one of the factors is much bigger than the other – if one of the heat exchangers is already much ‘better’ than the other – then it does not help to make the better one even better. In the denominator, the smaller number in the sum can be neglected before and after optimization, the superior properties always cancel out, and the ‘bad’ component fully determines performance. (If one of the ‘factors’ is zero, total power is zero.) Examples for ‘bad’ exchangers: If the heat exchanger tubes in the tank are much too short or if a flat plate collector is used instead of an unglazed collector.

On the other hand, if you make a formerly ‘worse’ exchanger much better, the ratio will change significantly. If both exchangers have properties of the same order of magnitude – which is what we design our systems for – optimizing one will change things for the better, but never linearly, as effects always cancel out to some extent (you increase numbers in both parts of the fraction).
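A few numbers make the behavior of the combined coefficient tangible. The factors below are arbitrary illustration values without units; only their ratios matter.

```python
def combined_coefficient(f1, f2):
    """Product over sum: the inverse of the sum of inverses."""
    return f1 * f2 / (f1 + f2)


def relative_gain(before, after):
    """Relative improvement, e.g. 0.25 means +25%."""
    return after / before - 1.0


# Very unequal exchangers (100 vs. 1): double the better or the worse one
unequal = combined_coefficient(100.0, 1.0)
gain_better_doubled = relative_gain(unequal, combined_coefficient(200.0, 1.0))
gain_worse_doubled = relative_gain(unequal, combined_coefficient(100.0, 2.0))

# Exchangers of the same order of magnitude: double one of them
gain_balanced = relative_gain(
    combined_coefficient(1.0, 1.0), combined_coefficient(2.0, 1.0)
)

print(f"{gain_better_doubled:.1%} {gain_worse_doubled:.1%} {gain_balanced:.1%}")
```

Doubling the already superior exchanger gains less than one percent, doubling the inferior one nearly doubles the result, and in the balanced case doubling one exchanger gains about a third.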

So there is no ‘rated performance’ in kW or kW per area you could attach to a collector. Its effective performance also depends on the properties of the heat exchanger in the tank.

But there is a subtle consequence to consider: The smaller collector can deliver the same energy and thus ‘has’ twice the power per area. However, air temperature is given, and [2] must hold: In order to achieve this, the delta T between brine and air necessarily has to increase. So brine will be a bit colder and thus the heat pump’s Coefficient of Performance will be a bit lower. Over a full season, including the warm periods of heating hot water only, the effect is less pronounced – but we see a more significant change in performance data and brine inlet temperature for the ice months in the respective seasons.
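Equation [2] solved for delta T shows this directly. The harvesting power and the coefficient per area in this sketch are made-up values, chosen only to give round numbers; in the real system the power also shifts a bit, as the exchanger in the tank is part of the game.

```python
def required_delta_t(power_w, area_m2, coefficient_w_per_m2_k):
    """Mean temperature difference between air and brine needed per [2]."""
    return power_w / (area_m2 * coefficient_w_per_m2_k)


# Hypothetical values: same harvesting power for both collector sizes
power_w = 2400.0
coefficient_w_per_m2_k = 40.0

delta_t_full = required_delta_t(power_w, 24.0, coefficient_w_per_m2_k)
delta_t_half = required_delta_t(power_w, 12.0, coefficient_w_per_m2_k)
extra_brine_cooling = delta_t_half - delta_t_full
print(delta_t_full, delta_t_half, extra_brine_cooling)
```

At the same power, half the area needs twice the delta T, so the mean brine temperature has to drop by the difference between the two values.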

|

# Data for the Heat Pump System: Heating Season 2016-2017

I update the documentation of measurement data [PDF] about twice a year. This post is to provide a quick overview for the past season.

The PDF also contains the technical configuration and sizing data. Based on typical questions from an ‘international audience’ I add a summary here plus some ‘cultural’ context:

Building: The house is a renovated, nearly 100-year-old building in Eastern Austria: a typical so-called ‘Streckhof’ – an elongated, former small farmhouse. Some details are mentioned here. Heating energy for space heating of two storeys (185m2) and hot water is about 17.000-20.000kWh per year. The roof / attic had been rebuilt in 2008, and the facade was thermally insulated. However, the major part of the house is without an underground level, so most energy is lost via ground. Heating only the ground floor (75m2) with the heat pump reduces heating energy only by 1/3.

Climate: This is the sunniest region of Austria – the lowlands of the Pannonian Plain bordering Hungary. We have Pannonian ‘continental’ climate with low precipitation. Normally, monthly average temperatures in winter are only slightly below 0°C in January, and weeks of ‘ice days’ in a row are very rare.

Heat energy distribution and storage (in the house): The renovated first floor has floor loops while at the ground floor mainly radiators are used. Wall heating has been installed in one room so far. A buffer tank is used for the heating water as this is a simple ‘on-off’ heat pump always operating at about its rated power. Domestic hot water is heated indirectly using a hygienic storage tank.

Heating system: An off-the-shelf, simple brine-water heat pump uses a combination of an unglazed solar-air collector and an underground water tank as a heat source. Energy is mainly harvested from rather cold air via convection.

Addressing often asked questions: Off-the-shelf = Same type of heat pump as used with geothermal systems. Simple: Not-smart, not trying to be the universal energy management system, as the smartness is in our own control unit and logic for managing the heat source(s). Brine: A mixture of glycol and water (similar to the fluid used with flat solar thermal collectors) = antifreeze as the temperature of brine is below 0°C in winter. The tank is not a seasonal energy storage but a buffer for days or weeks. In this post hydraulics is described in detail, and typical operating conditions throughout a year. Both tank and collector are needed: The tank provides a buffer of latent energy during ‘ice periods’ and it allows harvesting more energy from air, but the collector actually provides for about 75% of the total ambient energy the heat pump needs in a season.

Tank and collector are rather generously sized in relation to the heating demands: about 25m3 volume of water (total volume +10% freezing reserve) and 24m2 collector area.

The overall history of data documented in the PDF also reflects ongoing changes and some experiments, like heating the first floor with a wood stove, toggling the effective area of the collector used between 50% and 100%, or switching off the collector to simulate a harsher winter.

Data for the past season

Finally we could create a giant ice cube naturally. 14m3 of ice had been created in the coldest January in 30 years. The monthly average temperature was -3,6°C, 3 degrees below the long-term average.

(Re the oscillations of the ice volume see here and here.)

We heated only the ground floor in this season and needed 16.600 kWh (incl. hot water) – about the same heating energy as in the previous season. On the other hand, we also used only half of the collector – 12m2. The heating water inlet temperature for the radiators was even 37°C in January.

For the first time the monthly performance factor was well below 4. The performance factor is the ratio of output heating energy and input electrical energy for heat pump and brine pump. In middle Europe we measure both energies in kWh ;-) The overall seasonal performance factor was 4,3.
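As a quick cross-check of this definition: From the rounded seasonal numbers quoted in this post, 16.600 kWh of heating energy and a performance factor of 4,3, the electrical input can be estimated. This is an estimate from rounded figures only, not a measured value from the PDF, and the function names are made up for this sketch.

```python
def performance_factor(heating_kwh, electrical_kwh):
    """Output heating energy per electrical input (heat pump + brine pump)."""
    return heating_kwh / electrical_kwh


def implied_electrical_kwh(heating_kwh, factor):
    """Electrical input implied by heating energy and performance factor."""
    return heating_kwh / factor


# Rounded seasonal numbers from the text
season_heating_kwh = 16600.0
season_factor = 4.3

season_electrical_kwh = implied_electrical_kwh(season_heating_kwh, season_factor)
print(f"{season_electrical_kwh:.0f} kWh")
```

So the heat pump and brine pump together should have needed somewhat less than 4.000 kWh of electrical energy in this season.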

The monthly performance factor is a bit lower again in summer, when only hot water is heated (and thus the heat pump’s COP is lower because of the higher target temperature).

Per day we needed about 100kWh of heating energy in January, while the collector could not harvest that much:

In contrast to the season of the Ice Storage Challenge, also the month before the ‘challenge’ (Dec. 2016) was not too collector-friendly. But when the ice melted again, we saw the usual large energy harvests. Overall, the collector could not contribute the full ‘typical’ 75% of ambient energy this season.

(Definitions, sign conventions explained here.)

But there was one positive record, too. In the hot summer of 2017 we consumed the highest cooling energy so far – about 600kWh. The floor loops are used for passive cooling; the heating buffer tank is used to transfer heat from the floor loops to the cold underground tank. In ‘colder’ summer nights the collector is in turn used to cool the tank, and every time hot tap water is heated up the tank is cooled, too.

Of course the available cooling power is just a small fraction of what an AC system for the theoretical cooling load would provide for. However, this moderate cooling is just what – for me – makes the difference between unbearable and OK on really hot days with more than 35°C peak ambient temperature.

|