Cloudy Troubleshooting (2)

Unrelated to part 1 – but the same genre.

Actors this time:

  • File Cloud: A cloud service for syncing and sharing files. We won’t drop a brand name, will we?
  • Client: Another user of File Cloud.
  • [Redacted]: Once known for reliability and as The Best Network.
  • Dark Platform: Wannabe hackers’ playground.
  • elkement: Somebody who sometimes just wants to be an end user, but always ends up sniffing and debugging.

There are no dialogues with human life-forms this time, only the elkement’s stream of consciousness, interacting with the others by looking at things on a screen.

elkement: Time for a challenging Sunday hack!

elkement connects to The Dark Platform. Hardly notices anything in the real world anymore. But suddenly elkement looks at the clock – and at File Cloud’s icon next to it.

elkement: File Cloud, what’s going on?? Seems you have a hard time Connecting… for hours now? You have not even synced my hacker notes from yesterday evening?

elkement tries to avoid looking at File Cloud, but it gets too painful.

elkement: OK – let’s consider the File Cloud problem the real Sunday hacker’s challenge…

elkement walks through the imaginary checklist:

  • File Cloud mentioned on DownDetector website? No.
  • Users tweeting about outage? No.
  • Do the other cloudy apps work fine? Yes.
  • Do other web sites work fine? Yes.
  • Does my router need its regular reboot because its DNS server got stuck? No.
  • Should I perhaps try the usual helpdesk recommendation? Yes. (*)

(*) elkement turns router and firewall off and on again. Does not help.

elkement gets worried about Client using File Cloud, too. Connects to Client’s network – via another cloudy app (that obviously also works).

  • Does Client have the same issues? Yes and No – Yes at one site, No at another site.

elkement: Oh no – do I have to set up a multi-dimensional test matrix again to check for weird dependencies?

Coffee Break. Leaving the hacker’s cave. Gardening.

elkement: OK, let’s try something new!

elkement connects to super shaky mobile internet via USB tethering on the smart phone.

  • Does an alternative internet connection fix File Cloud? Yes!!

elkement: Huh!? Will now again somebody explain to me that a protocol (File Cloud) is particularly sensitive to hardly noticeable network disconnects? Is it maybe really a problem with [Redacted] this time?

elkement checks out DownDetector – and there they are: the angry users and red spots on the map. They mention that seemingly random websites and applications fail. And that [Redacted] is losing packets.

elkement: Really? Only packets for File Cloud?

elkement starts sniffing. Checks IP addresses.

(elkement: Great, whois does still work, despite the anticipated issues with GDPR!)

elkement spots communication with File Cloud. File Cloud client and server are stuck in a loop of misunderstandings. File Cloud client is rude and says: RST, then starts again. Says Hello. They never shake hands as a previous segment was not captured.

elkement: But why does all the other stuff work??

elkement googles harder. Indeed, some other sites might be slower – not The Dark Platform, fortunately. Now finally Google and duckduckgo stop working, too. 

elkement: I can’t hack without Google.

elkement hacks something without Google though. Manages to ignore File Cloud’s heartbreaking connection attempts.

A few hours later it’s over. File Cloud syncs hacker notes. Red spots on DownDetector start to fade out while the summer sun is setting.

~

FIN, ACK

Cloudy Troubleshooting

Actors:

  • Cloud: Service provider delivering an application over the internet.
  • Client: Business using the Cloud
  • Telco: Service provider operating part of the network infrastructure connecting them.
  • elkement: Somebody who always ends up playing intermediary.

~

Client: Cloud logs us off ever so often! We can’t work like this!

elkement: Cloud, what timeouts do you use? Client was only idle for a short break and is logged off.

Cloud: Must be something about your infrastructure – we set the timeout to 1 hour.

Client: It’s becoming worse – Cloud logs us off every few minutes even when we are in the middle of working.

[elkement does a quick test. Yes, it is true.]

elkement: Cloud, what’s going on? Any known issue?

Cloud: No issue on our side. We have thousands of happy clients online. If we had issues, our inboxes would be on fire.

[elkement does more tests. Different computers at Client. Different logon users. Different Client offices. Different speeds of internet connections. Computers at elkement office.]

elkement: It is difficult to reproduce. It seems like it works well for some computers or some locations for some time. But Cloud – we did not have any issues of that kind in the last year. This year the troubles started.

Cloud: The timing of our app is sensitive: If network cards in your computers turn on power saving that might appear as a disconnect to us.

[elkement learns what she never wanted to know about various power saving settings. To no avail.]

Cloud: What about your bandwidth?… Well, that’s really slow. If all people in the office are using that connection we can totally understand why our app sees your users disappearing.

[elkement on a warpath: Tracking down each application eating bandwidth. Learning what she never wanted to know about tuning the background apps, tracking down processes.]

elkement: Cloud, I’ve throttled everything. I am the only person using Client’s computers late at night, and I still encounter these issues.

Cloud: Upgrade the internet connection! Our protocol might choke on a hardly noticeable outage.

[elkement has to agree. The late-night tests were done over a remote connection; so measurement may impact results, as in quantum physics.]

Client: Telco, we buy more internet!

[Telco installs more internet, elkement measures speed. Yeah, fast!]

Client: Nothing has changed, Cloud still kicks us out every few minutes.

elkement: Cloud, I need to badger you again….

Cloud: Check the power saving settings of your firewalls, switches, routers. Again, you are the only one reporting such problems.

[The router is a blackbox operated by Telco]

elkement: Telco, does the router use any power saving features? Could you turn that off?

Telco: No we don’t use any power saving at all.

[elkement dreams up conspiracy theories: Sometimes performance seems to degrade after business hours. Cloud running backup jobs? Telco’s lines clogged by private users streaming movies? But sometimes it’s working well even in the location with the crappiest internet connection.]

elkement: Telco, we see this weird issue. It’s either Cloud, Client’s infrastructure, or anything in between, e.g. you. Any known issues?

Telco: No, but [proposal of test that would be difficult to do]. Or send us a Wireshark trace.

elkement: … which is what I planned to do anyway…

[elkement on a warpath 2: Sniffing, tracing every process. Turning off all background stuff. Looking at every packet in the trace. Getting to the level where there are no other packets in between the stream of messages between Client’s computers and Cloud’s servers.]

elkement: Cloud, I tracked it down. This is not a timeout. Look at the trace: Server and client communicating nicely, textbook three-way handshake, server says FIN! And no other packet in the way!

Cloud: Try to connect to one specific server of ours.

[elkement: Conspiracy theory about load balancers]

elkement: No – erratic as ever. Sometimes we are logged off, sometimes it works with crappy internet. Note that Client could work during vacation last summer with super shaky wireless connections.

[Lots of small changes and tests by elkement and Cloud. No solution yet, but the collaboration is seamless. No politics and finger-pointing who to blame – just work. The thing that keeps you happy as a netadmin / sysadmin in stressful times.]

elkement: Client, there is another interface which has fewer features. I am going to test it…

[elkement: Conspiracy theory about protocols. More night-time testing].

elkement: Client, Other Interface has the same problems.

[elkement on a warpath 3: Testing again with all possible combinations of computers, clients, locations, internet connections. Suddenly a pattern emerges…]

elkement: I see something!! Cloud, I believe it’s user-dependent. Users X and Y are logged off all the time while A and B aren’t.

[elkement scratches head: Why was this so difficult to see? Tests were not that unambiguous until now!]

Cloud: We’ve created a replacement user – please test.

elkement: Yes – New User works reliably all the time! 🙂

Client: It works –  we are not thrown off in the middle of work anymore!

Cloud: Seems that something about the user on our servers is broken – never happened before…

elkement: But wait 😦 it’s not totally OK: Now logged off after 15 minutes of inactivity? But never mind – at least not as bad as logged off every 2 minutes in the middle of some work.

Cloud: Yeah, that could happen – an issue with Add-On Product. But only if your app looks idle to our servers!

elkement: But didn’t you tell us that every timeout ever is no less than 1 hour?

Cloud: No – that 1 hour was another timeout …

elkement: Wow – classic misunderstanding! That’s why it was so difficult to spot the pattern. So we had two completely different problems, but both looked like unwanted logoffs after a brief period, and at the beginning both weren’t totally reproducible.

[elkement’s theory validated again: If anything qualifies elkement for such stuff at all it was experience in the applied physics lab – tracking down the impact of temperature, pressure and 1000 other parameters on the electrical properties of superconductors… and trying to tell artifacts from reproducible behavior.]

~


Logging Fun with UVR16x2: Photovoltaic Generator – Modbus – CAN Bus

The Data Kraken wants to grow new tentacles.

I am playing with the CMI – Control and Monitoring Interface – the logger / ‘ethernet gateway’ connected to our control units (UVR1611, UVR16x2) via CAN bus. The CMI has become a little Data Kraken itself: Inputs and outputs can be created for CAN bus and Modbus, and data from most CAN devices can also be logged via JSON.

Are these features useful to integrate the datalogger of our photovoltaics inverter – Fronius Symo 4.5-3-M? I am now logging data to a USB stick and feeding the CSV files to the SQL Server Data Kraken. The USB logger’s logging interval is 5 minutes whereas Modbus TCP allows for logging every few seconds. The inverter has built-in energy management features, but it can only ‘signal’ via a relay which also requires proper wiring. Modbus TCP, on the other hand, could use the existing WLAN connection of the inverter and the control unit could do something smarter with the sensor reading of the output power.

My motivation is to test if you – as a UVR16x2 user – can re-use a logger you already have – the CMI – as much as possible, avoiding the need to run another ‘logging server’ all the time (also, my SQL Server is for analysis, not for real-time logging). I know that there are many open source Modbus clients available and that it is easy to write a Python script.

Activate Modbus on the inverter: I prefer floating numbers to integers plus a scaling factor, and I turn off the option to make changes via Modbus:

Modbus settings, Fronius Datalogger, inverter’s local web server. 502 is the standard TCP port for Modbus. The alternative to floating numbers is integers plus a varying scaling factor (SF), to be found in another register.

Check Fronius documentation of its Modbus registers: The document is currently available here. There are different sets of registers related to the inverter or associated with one string of PV panels:

PDF p.97, Common Inverter Model. For logging AC output power you need: address 40092, register type 3 (read holding registers), datatype float32 (‘corresponds’ to two 16-bit integer registers, thus size 2).

The address to be configured on a Modbus client is smaller than this address by 1 – so 40091 needs to be set to log AC power.

Using these configuration parameters an analog Modbus Input is added at the CMI. The signal is ‘digital’ – but in field-bus-language everything that is not a single bit – 0/1 – seems to be ‘analog’.

Modbus input at the CMI. Input value:  32bits read from the bus interpreted as an integer. Actual value: Integer part of the ‘true value’ = the 32bits interpreted as 32-bit float.

Yes, I checked the network trace 😉 as the byte order dropdown menu confused me: According to the Modbus protocol specification Big Endian is required, not an option.
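The address offset and the two readings of the same 32 bits can be sketched in a few lines of Python. This is only an illustration, not what the CMI does internally: the function names are made up, the register contents 0x449A and 0x5000 are an invented example (they encode 1234.5), and reading the ‘input value’ as an unsigned integer is an assumption.

```python
import struct


def protocol_address(documented_address: int) -> int:
    """Address to configure on the Modbus client: documented address minus 1."""
    return documented_address - 1


def registers_to_float32(high_word: int, low_word: int) -> float:
    """Interpret two 16-bit registers as one big-endian IEEE 754 float32."""
    raw = struct.pack(">HH", high_word, low_word)
    return struct.unpack(">f", raw)[0]


def registers_to_uint32(high_word: int, low_word: int) -> int:
    """The same 32 bits read as an unsigned integer (assumed 'input value')."""
    return (high_word << 16) | low_word


if __name__ == "__main__":
    high, low = 0x449A, 0x5000  # invented example registers, not a real reading
    print(protocol_address(40092))                 # address to set for AC power
    print(registers_to_uint32(high, low))          # bits read as an integer
    print(registers_to_float32(high, low))         # bits read as a float
    print(int(registers_to_float32(high, low)))    # integer part of the float
```

Swapping the two words before unpacking would mimic a wrong choice in the byte order dropdown.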

Factors and data types: Only integer values are understood by CAN devices. Decimal places might be indicated by a scaling factor. The PV power value in Watt has enough significant digits; so the integer part of the float number is fine. But for current in Ampere – typically about 15A maximum – a Factor of 10 would be better. It would not have helped to select int + scaling factor at the inverter: The scaling factor would be stored in a second register, there is a different factor for every parameter, and you cannot configure another ‘scaling factor register’ per input at the CMI. Theoretically you could log the scaling factor separately and re-scale the value in a custom application – but then I would use a separate, custom logger.

In any case, if you screw this up, you see nonsensical numbers on the CAN bus: Slowly evolving positive values – like PV power on a sunny day – are displayed as wild variations of signed integers between -32000 and 32000 😉
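A rough Python sketch of both effects, under stated assumptions: the helper names are invented, and treating the low 16 bits of a float as a signed integer is only one plausible way to end up with such wild numbers; I have not verified that this is what happens inside the CMI.

```python
import struct


def to_can_integer(value: float, factor: int = 1) -> int:
    """Scale a reading and keep only the integer part, as sent over CAN."""
    return int(value * factor)


def from_can_integer(raw: int, factor: int = 1) -> float:
    """Undo the scaling on the receiving side."""
    return raw / factor


def low_word_as_int16(value: float) -> int:
    """Misread the low 16 bits of a big-endian float32 as a signed integer."""
    raw = struct.pack(">f", value)
    return struct.unpack(">h", raw[2:])[0]


if __name__ == "__main__":
    # A current of 15.37 A keeps one decimal place with a factor of 10.
    print(to_can_integer(15.37, 10), from_can_integer(153, 10))
    # Slowly rising power values turn into jumpy signed integers.
    for watts in (1234.5, 1234.75, 1235.0, 1236.3):
        print(watts, low_word_as_int16(watts))
```

The point of the second function is only that small changes in a float reshuffle its low bits completely, so any 16-bit misreading looks like noise.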

Where are the ‘logged’ data? The CMI is first and foremost the data logger for the control units. The CMI does not immediately store the data from Modbus inputs in a  local ‘logging database’. All I have achieved so far is to display the value on the Settings page. The CMI can only log values from the CAN bus or DL bus. So we need an…

… Analog CAN Output at the CMI:

The CMI has the default node number 56 on the CAN bus. Other CAN devices on the bus can query it for this parameter by specifying node 56 and output no 1.

These are the devices on our CAN bus:

CAN bus displayed on the CMI’s website. UVR1611 and UVR16x2 controllers can be managed by clicking their icons – which brings up a web page that resembles the controller’s local display.

The CMI’s Logging page looks tempting – can we simply select the CMI itself as a CAN logging source – CAN 56?

Configuration of the devices the CMI logs data from, via CAN bus. CAN 1 – UVR1611, CAN 2 – UVR16x2, CAN 41 – energy meter CAN-EZ.

Nothing stops you from selecting CAN 56 in this dropdown menu, but it does not end well:

CAN error message displayed at the logger CMI when you try to configure the CMI also as a logging source.

We need a round-trip: Data needs to be sent to a supported device first – one of the controllers on the CAN bus. We need an…

… Analog CAN input / network variable at the UVR16x2:

Configuration of a CAN input at the controller UVR16x2 (via CMI’s web interface to the controller).

The value of AC power is displayed as integer without scaling. Had a factor of 10 been used at the Modbus input it would be ‘corrected’ here, using the Unit called dimensionless,1.


Values received by the controller UVR16x2 over CAN bus.

Result of all this: UVR16x2 knows PV power and can use it to do magic smart things when controlling the heat pump. On the other hand, CMI can log this value – in the same way it logs all other sensor readings.

Log files are retrieved by Winsol, the free logging software for the CMI …

Logged data visualized with Winsol. Logfiles are downloaded from the CMI on the internal LAN or via Technische Alternative’s web portal. PV power (PV.Leistung.Watt) is displayed together with global radiation on a vertical plane (GBS, at the solar/air collector for the heat pump), ambient temperature (red), temperature of solar/air collector (orange)

… or logging is configured at the web portal cmi.ta.co.at …

Configuration of logging at cmi.ta.co.at: Supported loggers are UVR1611 and UVR16x2. Values to be logged are selected from all direct inputs / outputs / functions and from CAN network inputs and outputs.

… and data can be viewed online:

Data visualized at cmi.ta.co.at. Data logged via CAN are sent from the CMI to the web portal.

Using this kind of logging for all values the inverter provides would be costly: It’s not just a column you add to a log file, but you occupy one of the limited inputs and outputs at the CMI and the controller. If you really need to know the voltage between phase 1 and 2 or apparent power you better stick with the USB file or use a separate Modbus logger like a Raspberry Pi. This project is great and documented very well – data acquisition from a Symo inverter using Python plus a web front end.

Sending Modbus data back and forth from the CMI to UVR controllers is only worth the effort if you need them for control, not for ‘nice-to-have’ logging.

Internet of Things. Yet Another Gloomy Post.

Technically, I work with Things, as in the Internet of Things.

As outlined in Everything as a Service many formerly ‘dumb’ products – such as heating systems – become part of service offerings. A vital component of the new services is the technical connection of the Thing in your home to that Big Cloud. It seems every energy-related system has got its own Internet Gateway now: Our photovoltaic generator has one, our control unit has one, and the successor of our heat pump would have one, too. If vendors don’t bundle their offerings soon, we’ll end up with substantial electricity costs for powering a lot of separate gateways.

Experts have warned for years that the Internet of Things (IoT) comes with security challenges. Many Things’ owners still keep default or blank passwords, but the most impressive threat in my opinion is not hacking individual systems: Easily hacked things can be hijacked to serve as zombie clients in a botnet and launch a joint Distributed Denial of Service attack against a single target. Recently the blog of renowned security reporter Brian Krebs has been taken down, most likely as an act of revenge by DDoSers (Crime is now offered as a service as well.). The attack – a tsunami of more than 600 Gbps – was described as one of the largest the internet had seen so far. Hosting provider OVH was subject to a record-breaking Tbps attack – launched via captured … [cue: hacker movie cliché] … cameras and digital video recorders on the internet.

I am about the millionth blogger ‘reporting’ on this, nothing new here. But the social media news about the DDoS attacks collided with another social media micro outrage  in my mind – about seemingly unrelated IT news: HP had to deal with not-so-positive reporting about its latest printer firmware changes and related policies –  when printers started to refuse to work with third-party cartridges. This seems to be a legal issue or has been presented as such, and I am not interested in that aspect here. What I find interesting is the clash of requirements: After the DDoS attacks many commentators said IoT vendors should be held accountable. They should be forced to update their stuff. On the other hand, end users should remain owners of the IT gadgets they have bought, so the vendor has no right to inflict any policies on them and restrict the usage of devices.

I can relate to both arguments. One of my main motivations ‘in renewable energy’ or ‘in home automation’ is to make users powerful and knowledgeable owners of their systems. On the other hand I have been ‘in security’ for a long time. And chasing firmware for IoT devices can be tough for end users.

It is a challenge to walk the tightrope really gracefully here: A printer may be traditionally considered an item we own whereas the internet router provided by the telco is theirs. So we can tinker with the printer’s inner workings as much as we want but we must not touch the router and let the telco do their firmware updates. But old-school devices are given more ‘intelligence’ and need to be connected to the internet to provide additional services – like that printer that lets you print from your smartphone easily (Yes, but only if you register it at the printer manufacturer’s website before.). In addition, our home is not really our castle anymore. Our computers aren’t protected by the telco’s router / firmware all the time, but we work in different networks or in public places. All the Things we carry with us, someday smart wearable technology, will check in to different wireless and mobile networks – so their security bugs should better be fixed in time.

If IoT vendors should be held accountable and update their gadgets, they have to be given the option to do so. But if the device’s host tinkers with it, firmware upgrades might stall. In order to protect themselves from legal prosecution, vendors need to state in contracts that they are determined to push security updates and you cannot interfere with it. Security can never be enforced by technology only – for a device located at the end user’s premises.

It is a horrible scenario – and I am not sure if I refer to hacking or to proliferation of even more bureaucracy and over-regulation which should protect us from hacking but will add more hurdles for would-be start-ups that dare to sell hardware.

Theoretically a vendor should be able to separate the security-relevant features from nice-to-have updates. For example, in a similar way, in smart meters the functions used for metering (subject to metering law) should be separated from ‘features’ – the latter being subject to remote updates while the former must not. Sources told me that this is not an easy thing to achieve, at least not as easy as presented in the meters’ marketing brochure.

Linksys's Iconic Router

That iconic Linksys router – sold for more than 10 years (and a beloved test device of mine). Still popular because you could use open source firmware. Something that new security policies might seek to prevent.

If hardware security cannot be regulated, there might be more regulation of internet traffic. Internet Service Providers could be held accountable to remove compromised devices from their networks, for example after having notified the end user several times. Or smaller ISPs might be cut off by upstream providers. Somewhere in the chain of service providers we will have to deal with more monitoring and regulation, and in one way or other the playful days of the earlier internet (romanticized with hindsight, maybe) are over.

When I saw Krebs’ site going offline, I wondered what small businesses should do in general: His site is now DDoS-protected by Google’s Project Shield, a service offered to independent journalists and activists after his former pro-bono host could not deal with the load without affecting paying clients. So one of the Siren Servers I commented on critically so often came to the rescue! A small provider will not be able to deal with such attacks.

WordPress.com should be well-protected, I guess. I wonder if we will all end up hosting our websites at such major providers only, or ‘blog’ directly to Facebook, Google, or LinkedIn (now part of Microsoft) to be safe. I had advised against self-hosting WordPress myself: If you miss security updates you might jeopardize not only your website, but also others using the same shared web host. If you live on a platform like WordPress or Google, you will complain from time to time about limited options or feature updates you don’t like – but you don’t have to care about security. I compare this to avoiding legal issues as an artisan selling hand-made items via Amazon or the like, in contrast to having to update your own shop’s business logic after every change in international tax law.

I have no conclusion to offer. Whenever I read news these days – on technology, energy, IT, anything in between, The Future in general – I feel reminded of this tension: Between being an independent neutral netizen and being plugged in to an inescapable matrix, maybe beneficial but Borg-like nonetheless.

Hacking My Heat Pump – Part 2: Logging Energy Values

In the last post, I showed how to use a Raspberry Pi as a CAN bus logger – using a test bus connected to control unit UVR1611. Now I have connected it to my heat pump’s bus.

Credits for software and instructions:

Special thanks to SK Pang Electronics who provided me with CAN boards for Raspberry Pi after having read my previous post!!

CAN boards for Raspberry Pi, by SK Pang

CAN extension boards for Raspberry Pi, by SK Pang. Left: PiCAN 2 board (40 GPIO pins), right: smaller, retired PiCAN board with 26 GPIO pins – the latter fits my older Pi. In contrast to the board I used in the first tests, these also have a serial (DB9) interface.

Wiring CAN bus

We use a Stiebel-Eltron WPF 7 basic heat pump installed in 2012. The English website now refers to model WPF 7 basic s.

The CAN bus connections described in the German manual (Section 12.2.3) and the English manual (Wiring diagram, p.25) are similar:

Stiebel-Eltron WPF 7 basic - CAN bus connections shown in German manual

CAN bus connections inside WPF 7 basic heat pump. For reference, see the description of the Physical Layer of the CAN protocol. Usage of the power supply (BUS +) is optional.

H, L and GROUND wires from the Pi’s CAN board are connected to the respective terminals inside the heat pump. I don’t use the optional power supply as the CAN board is powered by Raspberry Pi, and I don’t terminate the bus correctly with 120 Ω. As with the test bus, wires are rather short and thus have low resistance.

Stiebel-Eltron WPF 7 basic - CAN bus connections inside the heat pump, cable from Raspberry Pi connected.

Heat pump with cover removed – CAN High (H – red), Low (L – blue), and Ground (yellow) are connected. The CAN cable is a few meters long and connects to the Raspberry Pi CAN board.

In the first tests Raspberry Pi had the privilege to overlook the heat pump room as the top of the buffer tank was the only spot where the WLAN signal was strong enough …

Raspberry Pi, on top of the buffer tank

Typical, temporary nerd’s test setup.

… or I used a cross-over ethernet cable and a special office desk:

Working on the heat pump - Raspberry Pi adventures

Typical, temporary nerd’s workplace.

Now Raspberry Pi has its final position on the ‘organic controller board’, next to control unit UVR16x2 – and after a major upgrade to both LAN and WLAN all connections are reliable.

Raspberry Pi with PiCAN board from SK Pang and UVR16x2

Raspberry Pi with PiCAN board from SK Pang and UVR16x2 control unit from Technische Alternative (each connected to a different CAN bus).

Bringing up the interface

According to messpunkt.org the bit rate of Stiebel-Eltron’s bus is 20000 bit/s; so the interface is activated with:

sudo ip link set can0 type can bitrate 20000
sudo ifconfig can0 up

Watching the idle bus

First I was simply watching with sniffer Wireshark if the heat pump says anything without being triggered. It does not – only once every few minutes there are two packets. So I need to learn to talk to it.

Learning about CAN communications

SK Pang provides an example of requesting data using open source tool cansend: The so-called CAN ID is followed by # and the actual data. This CAN ID refers to an ‘object’ – a set of properties of the device, like the set of inputs or outputs – and it can also contain the node ID of the device on the bus. There are many CAN tutorials on the net; I found this (German) introduction and this English tutorial very useful.
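To get a feeling for the ID#data notation, here is a small Python sketch that only builds and parses such strings; it does not touch the bus. The function names are invented, and the example bytes are the query that shows up in the network trace further down in this post.

```python
def cansend_frame(can_id: int, data: bytes) -> str:
    """Build the 'ID#DATA' argument for a standard (11-bit) CAN frame."""
    if not 0 <= can_id <= 0x7FF:
        raise ValueError("standard CAN IDs have 11 bits")
    if len(data) > 8:
        raise ValueError("a classic CAN frame carries at most 8 data bytes")
    return f"{can_id:03X}#{data.hex().upper()}"


def parse_cansend_frame(text: str) -> tuple[int, bytes]:
    """Split 'ID#DATA' back into the numeric CAN ID and the data bytes."""
    id_part, _, data_part = text.partition("#")
    return int(id_part, 16), bytes.fromhex(data_part)


if __name__ == "__main__":
    frame = cansend_frame(0x680, bytes.fromhex("3100fa0931"))
    print(frame)
    print(parse_cansend_frame(frame))
```

The resulting string would be passed to cansend after the interface name, but I would only do that once I am sure the frame is a harmless read request.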

I was able to follow the communications of the two nodes in my test bus as I knew their node numbers and what to expect – the data logger would ask the controller for a set of configured sensor outputs every minute. Most packets sent by either bus member are related to object 480, indicating the transmission of a set of values (Process Data Exchange Objects, PDOs. More details on UVR’s CAN communication, in German)

Network trace on test CAN bus: UVR1611 and BL-NET

Sniffing test CAN bus – communication of UVR1611 (node no 1) and logger BL-NET (node number 62 = 3e). Both devices use an ID related to object ID 480 plus their respective node number, as described here.
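How an object ID and a node number could combine into one CAN ID is sketched below. This assumes the usual CANopen convention in which the lowest 7 bits of the 11-bit ID carry the node number; the helper names are invented and the resulting IDs are computed here, not copied from the trace.

```python
NODE_MASK = 0x7F  # lowest 7 bits: node number (CANopen convention, assumed)


def combined_id(object_id: int, node: int) -> int:
    """CAN ID = object ID plus node number."""
    if not 1 <= node <= NODE_MASK:
        raise ValueError("node numbers range from 1 to 127")
    return object_id + node


def split_id(can_id: int) -> tuple[int, int]:
    """Split a CAN ID into (object ID, node number)."""
    return can_id & ~NODE_MASK, can_id & NODE_MASK


if __name__ == "__main__":
    for node in (1, 0x3E):  # UVR1611 and BL-NET on the test bus
        can_id = combined_id(0x480, node)
        print(f"node {node:#04x} -> ID {can_id:03x} -> {split_id(can_id)}")
```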

So I need to know object ID(s) and properly formed data values to ask the heat pump for energy readings – without breaking something by changing values.

Collecting interesting heat pump parameters for monitoring

I am very grateful for Jürg’s CAN tool can_scan, which allows for querying a Stiebel-Eltron heat pump for specific values and also for learning about all possible parameters (listed in so-called Elster tables).

In order to check the list of allowed CAN IDs used by the heat pump I run:

./can_scan can0 680

can0 is the (default) name of the interface created earlier and 680 is my (the sender’s) CAN ID, one of the IDs allowed by can_scan.

Start of output:

elster-kromschroeder can-bus address scanner and test utility
copyright (c) 2014 Jürg Müller, CH-5524

scan on CAN-id: 680
list of valid can id's:

  000 (8000 = 325-07)
  180 (8000 = 325-07)
  301 (8000 = 325-07)
  480 (8000 = 325-07)
  601 (8000 = 325-07)

In order to investigate available values and their meaning I run can_scan for each of these IDs:

./can_scan can0 680 180

Embedded below is part of the output, containing some of the values (and /* Comments */). This list of parameters is much longer than the list of values available via the display on the heat pump!

I am mainly interested in metered energies and current temperatures of the heat source (brine) and the ‘environment’ – to compare these values to other sensors’ output:

elster-kromschroeder can-bus address scanner and test utility
copyright (c) 2014 Jürg Müller, CH-5524

0001:  0000  (FEHLERMELDUNG  0)
0003:  019a  (SPEICHERSOLLTEMP  41.0)
0005:  00f0  (RAUMSOLLTEMP_I  24.0)
0006:  00c8  (RAUMSOLLTEMP_II  20.0)
0007:  00c8  (RAUMSOLLTEMP_III  20.0)
0008:  00a0  (RAUMSOLLTEMP_NACHT  16.0)
0009:  3a0e  (UHRZEIT  14:58)
000a:  1208  (DATUM  18.08.)
000c:  00e9  (AUSSENTEMP  23.3) /* Ambient temperature */
000d:  ffe6  (SAMMLERISTTEMP  -2.6)
000e:  fe70  (SPEICHERISTTEMP  -40.0)
0010:  0050  (GERAETEKONFIGURATION  80)
0013:  01e0  (EINSTELL_SPEICHERSOLLTEMP  48.0)
0016:  0140  (RUECKLAUFISTTEMP  32.0) /* Heating water return temperature */
...
01d4:  00e2  (QUELLE_IST  22.6) /* Source (brine) temperature */
...
/* Hot tap water heating energy MWh + kWh */
/* Daily totals */
092a:  030d  (WAERMEERTRAG_WW_TAG_WH  781)
092b:  0000  (WAERMEERTRAG_WW_TAG_KWH  0)
/* Total energy since system startup */
092c:  0155  (WAERMEERTRAG_WW_SUM_KWH  341)
092d:  001a  (WAERMEERTRAG_WW_SUM_MWH  26)
/* Space heating energy, MWh + kWh */
/* Daily totals */
092e:  02db  (WAERMEERTRAG_HEIZ_TAG_WH  731)
092f:  0006  (WAERMEERTRAG_HEIZ_TAG_KWH  6)
/* Total energy since system startup */
0930:  0073  (WAERMEERTRAG_HEIZ_SUM_KWH  115)
0931:  0027  (WAERMEERTRAG_HEIZ_SUM_MWH  39)
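The relation between the raw hex values and the readable values in this listing can be reproduced with a few lines of Python. The decoding rules below are inferred from the listing alone (temperatures as signed 16-bit numbers in tenths of a degree, time and date packed into two bytes), so treat them as assumptions rather than a specification; the function names are mine.

```python
def signed16(raw: int) -> int:
    """Interpret a 16-bit value as a two's complement signed integer."""
    return raw - 0x10000 if raw & 0x8000 else raw


def decode_temperature(raw: int) -> float:
    """Temperatures appear to be stored in tenths of a degree."""
    return signed16(raw) / 10


def decode_time(raw: int) -> str:
    """UHRZEIT: high byte looks like the minute, low byte like the hour."""
    return f"{raw & 0xFF:02d}:{raw >> 8:02d}"


def decode_date(raw: int) -> str:
    """DATUM: high byte looks like the day, low byte like the month."""
    return f"{raw >> 8:02d}.{raw & 0xFF:02d}."


if __name__ == "__main__":
    print(decode_temperature(0x00E9))  # AUSSENTEMP
    print(decode_temperature(0xFFE6))  # SAMMLERISTTEMP
    print(decode_time(0x3A0E))         # UHRZEIT
    print(decode_date(0x1208))         # DATUM
```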

Querying for one value

The heating energy to date in MWh corresponds to index 0931:

./can_scan can0 680 180.0931

The output of can_scan already contains the sum of the MWh (0931) and kWh (0930) values:

elster-kromschroeder can-bus address scanner and test utility
copyright (c) 2014 Jürg Müller, CH-5524

value: 0027  (WAERMEERTRAG_HEIZ_SUM_MWH  39.115)
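The arithmetic behind the combined value is simple; as a sketch in Python (function names are my own, the raw values are taken from the listing above):

```python
def total_mwh(mwh_raw: int, kwh_raw: int) -> float:
    """Combine the MWh register and the kWh register into one MWh value."""
    return mwh_raw + kwh_raw / 1000


def total_kwh(kwh_raw: int, wh_raw: int) -> float:
    """Same idea for the daily values: kWh register plus Wh register."""
    return kwh_raw + wh_raw / 1000


if __name__ == "__main__":
    # 0931 = 0027 (MWh part), 0930 = 0073 (kWh part)
    print(f"{total_mwh(0x0027, 0x0073):.3f} MWh")
    # 092f = 0006 (kWh part), 092e = 02db (Wh part)
    print(f"{total_kwh(0x0006, 0x02DB):.3f} kWh")
```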

The network trace shows that the logger (using ID 680) queries for two values related to ID 180 – the kWh and the MWh part:

Network trace on heat pump's CAN bus: Querying for space heating energy to date.

Network trace of Raspberry Pi CAN logger (ID 680) querying CAN ID 180. Since the returned MWh value is the sum of MWh and kWh value, two queries are needed. Detailed interpretation of packets in the text below.

Interpretation of these four packets – as explained on Jürg’s website here and here in German:

00 00 06 80 05 00 00 00 31 00 fa 09 31  
00 00 01 80 07 00 00 00 d2 00 fa 09 31 00 27
00 00 06 80 05 00 00 00 31 00 fa 09 30 
00 00 01 80 07 00 00 00 d2 00 fa 09 30 00 73
|---------| ||          |---| || |---| |---|
1)          2)          3)    4) 5)    6)

1) CAN ID used by the sender: 180 or 680 
2) Number of data bytes: 5 for queries, 7 for replies
3) CAN ID of the communications partner and type of message. 
For queries the second digit is 1. 
Pattern: n1 0m with n = 180 / 80 = 3 (hex) and m = 180 mod 8 = 0 (hex) 
Partner ID = 30 * 8 (hex) + 00 = 180 
Responses follow a similar pattern using second digit 2: 
Partner ID is: d0 * 8 + 00 = 680 
4) fa indicates that the Elster index number is greater than or equal to ff. 
5) Index (parameter) queried for: 0930 for kWh and 0931 for MWh
6) Value returned: 27h = 39, 73h = 115
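The field layout above can be checked against the four packets with a small shell sketch. The function decode_packet is made up for this example; it only handles packets in the exact layout shown here (marker fa followed by a two-byte index), and it computes the partner ID by the formula from the list above. Other message types are not covered.

```shell
#!/bin/sh
# Sketch: decode one packet line from the trace (space-separated hex bytes).
# decode_packet is a hypothetical helper that follows fields 1) to 6) above.
decode_packet() {
    # Split the hex bytes into positional parameters $1 ... ${15}
    set -- $1
    if [ "${11}" != "fa" ]; then
        echo "unsupported packet layout"
        return 1
    fi
    sender=$(( (0x$3 << 8) | 0x$4 ))                 # 1) sender CAN ID
    len=$((0x$5))                                    # 2) number of data bytes
    msgtype=$((0x$9 & 0x0f))                         # 3) second digit: 1 or 2
    partner=$(( (0x$9 & 0xf0) * 8 + 0x${10} ))       # 3) n0 * 8 + 0m
    index="${12}${13}"                               # 5) Elster index
    case $msgtype in
        1) kind=query ;;
        2) kind=reply ;;
        *) kind=unknown ;;
    esac
    if [ "$len" -eq 7 ]; then
        value=$(( (0x${14} << 8) | 0x${15} ))        # 6) value returned
        printf '%x %s to %x: index %s value %d\n' \
            "$sender" "$kind" "$partner" "$index" "$value"
    else
        printf '%x %s to %x: index %s\n' "$sender" "$kind" "$partner" "$index"
    fi
}

decode_packet '00 00 06 80 05 00 00 00 31 00 fa 09 31'
decode_packet '00 00 01 80 07 00 00 00 d2 00 fa 09 31 00 27'
```

For the two packets above this prints a query from 680 to 180 for index 0931, and the reply from 180 to 680 with value 39.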

I am not sure which node IDs my logger and the heat pump use. 180 seems to be an object ID without a node ID added, while 301 would refer to object ID + node ID 1. But I suppose that with only two devices on the bus, one of them only a listener, there is no ambiguity.

Logging script

I found all interesting indices listed under CAN ID 180, so I am now looping through this set once every three minutes with can_scan, cutting out the number, and adding it to a new line in a text log file. The CAN interface is (re-)started every time in case something happens, and the file is sent to my local server via FTP.

Every month a new log file is started. The log files are imported into my SQL Server and processed like the log files from UVR1611 / UVR16x2, the PV generator’s inverter, or the smart meter.

(Not the most elegant script – consider it a ‘proof of concept’! Another option is to trigger the sending of data with can_scan and collect output via can_logger.)

Interesting to-be-logged parameters are added to a ‘table’ – a file called indices:

0016 RUECKLAUFISTTEMP
01d4 QUELLE_IST
01d6 WPVORLAUFIST
091b EL_AUFNAHMELEISTUNG_WW_TAG_KWH
091d EL_AUFNAHMELEISTUNG_WW_SUM_MWH
091f EL_AUFNAHMELEISTUNG_HEIZ_TAG_KWH
0921 EL_AUFNAHMELEISTUNG_HEIZ_SUM_MWH
092b WAERMEERTRAG_WW_TAG_KWH
092f WAERMEERTRAG_HEIZ_TAG_KWH
092d WAERMEERTRAG_WW_SUM_MWH
0931 WAERMEERTRAG_HEIZ_SUM_MWH
000c AUSSENTEMP
0923 WAERMEERTRAG_2WE_WW_TAG_KWH
0925 WAERMEERTRAG_2WE_WW_SUM_MWH
0927 WAERMEERTRAG_2WE_HEIZ_TAG_KWH
0929 WAERMEERTRAG_2WE_HEIZ_SUM_MWH

Script:

# Define folders
logdir="/CAN_LOGS"
scriptsdir="/CAN_SCRIPTS"
indexfile="$scriptsdir/indices"

# FTP parameters
ftphost="FTP_SERVER"
ftpuser="FTP_USER"
ftppw="***********"

# Exit if scripts not found
if ! [ -d $scriptsdir ] 
then
    echo Directory $scriptsdir does not exist!
    exit 1
fi

# Create log dir if it does not exist yet
if ! [ -d $logdir ] 
then
    mkdir $logdir
fi

sleep 5

echo ======================================================================

# Start logging
while true
do

# Get current date and start new logging line
now=$(date +'%Y-%m-%d;%H:%M:%S')
line=$now
year=$(date +'%Y')
month=$(date +'%m')
logfile=$year-$month-can-log-wpf7.csv
logfilepath=$logdir/$logfile

# Create a new file for every month, write header line
if ! [ -f $logfilepath ] 
then
    # Date and time are separate columns in the data lines (see $now)
    headers="Datum;Uhrzeit"
    while read indexline
    do 
        header=$(echo $indexline | cut -d" " -f2) 
        headers+=";"$header
    done < $indexfile ; echo "$headers" > $logfilepath 
fi

# (Re-)start CAN interface
    sudo ip link set can0 type can bitrate 20000
    sudo ip link set can0 up

# Loop through interesting Elster indices
while read indexline
do 
    # Get output of can_scan for this index, search for line with output values
    index=$(echo $indexline | cut -d" " -f1)
    value=$($scriptsdir/./can_scan can0 680 180.$index | grep "value" | replace ")" "" | grep -o "\<[0-9]*\.\?[0-9]*$" | replace "." ",")     
    echo "$index $value"     

    # Append value to line of CSV file     
    line="$line;$value" 
done < $indexfile ; echo $line >> $logfilepath

# Send log file to server via FTP
ftp -n -v $ftphost << END_SCRIPT
ascii
user $ftpuser $ftppw
binary
cd RPi
ls
lcd $logdir
put $logfile
ls
bye
END_SCRIPT

echo "------------------------------------------------------------------"

# Wait - next logging data point
sleep 180

# Runs forever, use Ctrl+C to stop
done
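The value extraction inside the loop relies on the replace utility, which as far as I know ships with MySQL packages and may not be installed on every system. A possible alternative using only sed and tr is sketched below. The function name extract_value is made up, and the format of the sample lines is assumed from the can_scan output shown above. Unlike a pattern that starts at a word boundary, this sketch also keeps a leading minus sign.

```shell
#!/bin/sh
# Sketch: cut the number out of can_scan's "value:" line without 'replace'.
# extract_value is a hypothetical helper; it reads can_scan output on stdin.
extract_value() {
    # Keep only the number before the closing bracket of the value line,
    # then switch to a decimal comma as in the CSV log
    sed -n 's/^value:.*[[:space:]]\(-\{0,1\}[0-9][0-9.]*\))[[:space:]]*$/\1/p' |
        tr '.' ','
}

# Sample lines in the format printed by can_scan (second one is assumed)
printf '%s\n' 'value: 0027  (WAERMEERTRAG_HEIZ_SUM_MWH  39.115)' | extract_value
printf '%s\n' 'value: ffe6  (SAMMLERISTTEMP  -2.6)' | extract_value
```

In the script this would be used as: value=$($scriptsdir/can_scan can0 680 180.$index | extract_value)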

In order to autostart the script I added a line to the rc.local file:

su pi -c '/CAN_SCRIPTS/pkt_can_monitor'

Using the logged values

In contrast to brine or water temperatures, heating energies are not available on the heat pump’s CAN bus in real time: The main MWh counter is only incremented once per day at midnight, when the daily kWh counter is added to the previous value.

Daily or monthly energy increments are calculated from the logged values in the SQL database and for example used to determine performance factors (heating energy over electrical energy) shown in our documentation of measurement data for the heat pump system.
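I do this evaluation in SQL, but the arithmetic itself is simple enough to sketch in shell with awk. The function names below are made up, and apart from the 39,115 MWh reading shown earlier all numbers are invented for illustration only.

```shell
#!/bin/sh
# Sketch: energy increment between two counter readings and the resulting
# performance factor. All function names are hypothetical helpers.

# The log uses decimal commas; awk needs decimal points
to_point() {
    printf '%s\n' "$1" | tr ',' '.'
}

# Increment in kWh between two readings of a cumulative MWh counter
increment_kwh() {
    LC_ALL=C awk -v a="$(to_point "$1")" -v b="$(to_point "$2")" \
        'BEGIN { printf "%.0f\n", (b - a) * 1000 }'
}

# Performance factor = heating energy over electrical energy
performance_factor() {
    LC_ALL=C awk -v h="$1" -v e="$2" \
        'BEGIN { if (e > 0) printf "%.2f\n", h / e; else print "n/a" }'
}

# Invented example: readings logged at the same time on two consecutive days
heat=$(increment_kwh 39,115 39,152)    # heating energy counter, MWh
elec=$(increment_kwh 10,200 10,210)    # electrical energy counter, MWh
echo "heat: $heat kWh, electricity: $elec kWh"
performance_factor "$heat" "$elec"
```

Because the main counter only changes at midnight, the two readings should be taken at the same time of day, otherwise the increment mixes different days.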

Have I Seen the End of E-Mail?

Not that I desire it, but my recent encounters with ransomware make me wonder.

Some people in, say, accounting or HR departments are forced to use e-mail with utmost paranoia. Hackers send alarmingly professional e-mails that look like invoices, job applications, or notifications of postal services. Clicking a link starts the download of malware that will encrypt all your data and ask for ransom.

Theoretically you could still find out if an e-mail was legit by cross-checking with open invoices, job ads, and expected mail. But what if hackers learn about your typical vendors from your business website or if they read your job ads? Then they would send plausible e-mails and might refer to specific codes, like the number of your job ad.

Until recently I figured that only medium or larger companies would be subject to targeted attacks. One major Austrian telco was the victim of a Denial of Service attack and challenged to pay ransom. (They didn’t, and were able to deal with the attack successfully.)

But then I encountered a new level of ransomware attacks – targeting very small Austrian businesses by sending ‘expected’ job applications via e-mail:

  • The subject line was Job application as [a job that had been advertised weeks ago at a major governmental job service platform]
  • It was written in flawless German, using typical job applicant’s lingo as you learn in trainings.
  • It was addressed to the personal e-mail of the employee dealing with applications, not the public ‘info@’ address of the business
  • There was no attachment – so malware filters could not have found anything suspicious – but only a link to a shared cloud folder (‘…as the attachments are too large…’) – run by a legit European cloud company.
  • If you clicked the link (which you should not do unless you do this on a separate test-for-malware machine in a separate network) you saw a typical applicant’s photo and a second file – whose name translated to JobApplicationPDF.exe.

Suspicious features:

  • The EXE file should have triggered red lights. But it is not impossible that a job applicant creates a self-extracting archive, although I would compare that to wrapping your paper application in a box looking like a fake bomb.
  • Google’s Image Search showed that the photo has been stolen from a German photographer’s website – it was an example for a typical job applicant’s photo.
  • Both the cloud and the mail service used were lesser-known ones. It has been reported that Dropbox had removed suspicious files, so it seemed that attackers turned to alternative services. (Both mail and cloud provider reacted quickly and shut down the suspicious accounts.)
  • The e-mail did not contain a phone number or street address, just the pointer to the cloud store: Possible but weird as an applicant should be eager to encourage communications via all channels. There might be ‘normal’ issues with accessing a cloud store link (e.g. link falsely blocked by corporate firewall) – so the HR department should be able to call the applicant.
  • Googling the body text of the e-mail gave one result only – a new blog entry of an IT professional quoting it at full length. The subject line was personalized to industry sector and a specific job ad – but the bulk of the text was not.
  • The non-public e-mail address of the HR person was googleable as the job ad plus contact data appeared on a job platform in a different language and country, without the small company’s consent of course. So both e-mail address and job description could have been harvested automatically.

I also wonder if my Everything as a Service vision will provide a cure: More and more communication has been moved to messaging on social networks anyway – for convenience and avoiding false negative spam detection. E-Mail – powered by the old SMTP protocol with tacked-on security features, run on decentralized mail servers – is being replaced by messaging happening within a big monolithic block of a system like Facebook messaging. Some large employers already require their applicants to submit their CVs using their web platforms, just as large corporations demand that their suppliers use their billing platform instead of sending invoices per e-mail.

What needs to be avoided is downloading an executable file and executing it in an environment not controlled by security policies. A large cloud provider might have a better chance to enforce security, and viewing or processing an ‘attachment’ could happen in the provider’s environment. As an alternative all ‘our’ devices might actually be part of a service and controlled more tightly by centrally set policies. Disclaimer: Not sure if I like that.

(‘Computer virus’ – from my first website 1997. Credits mine)

 

Everything as a Service

Three years ago I found a research paper that proposed a combination of distributed computing and heating as a service: A cloud provider company like Google or Amazon would install computers in users’ homes – as black-boxes providing heat to the users and computing power to their cloud.

In the meantime I have encountered announcements of startups pursuing very similar ideas. So finally, after we have been reading about the Internet of Things every day, buzzwords associated with IT infrastructure enter the real world of hands-on infrastructure.

I believe that heating will indeed be offered as a service and like cloud-based IT services: The service provider will install a box in your cellar – a black-box in terms of user access, more like a home router operated by the internet provider today. It will be owned and operated by a provider you have a service contract with. There will be defined and restricted interfaces for limited control and monitoring – such as setting non-critical parameters like room temperature or viewing hourly and daily statistics.

Heating boxes will get smaller, more compact, and more aesthetically pleasing. They might be put in the hall rather than being tucked away in a room dedicated to technical gadgets. This is in line with a trend of smaller and smaller boiler rooms for larger and larger houses. Just like computers and routers went from ugly, clunky boxes to sleek design and rounded corners, heating boxes will look more like artistic stand-alone pillars. I remember a German startup which offered home batteries this beautiful a few years back – but they switched to another business model as they seem to have been too early.

Vendors of heating systems will try to simplify their technical and organizational interfaces with contractors: One vendor of heat pump systems told me they were working on a new way of exchanging parts all at once so that a technician certified in handling refrigerants will not be required. Anything that can go wrong on installation will go wrong no matter how detailed the checklist for the installer is – even inlet and outlet do get confused. A vendor’s vision is rather a self-contained box delivered to the client, including heating system(s), buffer storage tanks for heating water, and all required sensors, electrical wiring, and hydraulic connections between these systems – and there are solutions like that offered today.

The vendor will have secured access to this system over the internet. They will be able to monitor continuously, detect errors early and automatically, and either fix them remotely or notify the customer. In addition, vendors will be able to optimize their designs by analyzing consolidated data gathered from a large number of clients’ systems. This will work exactly in the way vendors of inverters for photovoltaic systems deal with clients’ data already today: Users get access to a cloud-based portal and show off their systems and data, and maybe enter a playful competition with other system owners – what might work for smart metering might work for related energy systems, too. The vendor will learn about systems’ performance data for different geographical regions and different usage patterns.

District heating is already offered as a service today: The user is entitled to using hot water (or cold water in case a heat pump’s heat source is shared among different users). Users sometimes dislike the lack of control and the fact they cannot opt out – as district heating only works economically if a certain number of homes in a certain area are connected to the service. But in some pilot areas in Germany and Austria combined heat and power stations have already been offered as a service, with a provider-operated black-box in the user’s home.

The idea of having a third external party operating essential infrastructure now owned by an end-user may sound uncommon, but we might get used to it when gasoline-powered cars in a user’s possession are replaced by electric vehicles and related services, like having a service contract for a battery instead of owning it. We used to have our own computer with all our data on it, and we used to download our e-mail onto it, delete it from the server, and deal with local backups. Now all of that is stored on a server owned by somebody else and which we share with other users. The incentive is the ease of access to our data from various devices and the included backup service.

I believe that all kinds of things and products as a service will be further incentivized by bundling traditionally separate products: I used to joke about the bank account bundled with electrical power, home insurance, and an internet plus phone flat rate – until the combined bank account and green power offering was shown on my online banking’s home screen. Bundling all these services will be attractive, and users might be willing to trade in their data for a much cheaper access to services – just as a non-sniffing smart phone is more expensive than its alternatives.

Heat pump - not cloud-powered.

I withhold judgement as I think there is a large grey and blurry area between allegedly evil platforms that own our lives and justified outsourcing to robust and transparent services that are easy to use also by the non tech-savvy.

Update 2016-06-02: Seems I could not withhold judgement in the comments 🙂 I better admit it here as the pingback from the book Service Innovation’s blog here might seem odd otherwise 😉

The gist of my argument made in the comments was:

I believe that artisans and craftsmen will belong in one of two categories in the future:
1) Either working as subcontractor, partner, or franchisee of large vendors, selling and installing standardized products – covering the last mile not accessible to robots and software (yet),
2) Or a lucky few will carve out a small niche and produce or customize bespoke units for clients who value luxurious goods for the sake of uniqueness or who value human imperfection as a fancy extra.

In other communication related to this post I called these platform effects Nassim Taleb’s Extremistan versus Mediocristan in action – the platform takes it all. Also, ever-growing regulation will help platforms rather than solo artisans, as only large organizations can deal effectively with growing requirements re compliance – put forth both by government and by large clients or large suppliers.