The Orphaned Internet Domain Risk

I have clicked on the company websites of social media acquaintances, and something is not right: slight errors in formatting, encoding errors in special German characters.

Then I notice that some of the pages contain links to other websites that advertise products in a spammy way. However, the links to the spammy sites are embedded in these alleged company websites in a subtle way: using the (nearly) correct layout, or embedding the link in a ‘news article’ that also contains legit product information – content really related to the internet domain I am visiting.

Looking up whois information tells me that these internet domains are not owned by my friends anymore – consistent with what they actually say on their social media profiles. So how come they ‘gave’ their former domains to spammers? They did not, and they didn’t need to: spammers simply need to watch out for expired domains, seize them when they become available – and then reconstruct the former legit content from public archives and interleave it with their spammy messages.
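
Spotting a lapsed domain is easy to automate – which is exactly what makes this scalable for spammers, and what makes monitoring your own old domains cheap. A minimal sketch using the plain WHOIS protocol (TCP port 43, RFC 3912); whois.verisign-grs.com answers for .com/.net, while other registries use other servers and response formats:

```python
# Minimal WHOIS query over TCP port 43 (RFC 3912) - no third-party libraries.
# Response formats differ per registry, so the expiry-line match is heuristic.
import socket

def whois_lookup(domain: str, server: str = "whois.verisign-grs.com") -> str:
    """Return the raw WHOIS response for a domain (server valid for .com/.net)."""
    with socket.create_connection((server, 43), timeout=10) as sock:
        sock.sendall((domain + "\r\n").encode("ascii"))
        chunks = []
        while data := sock.recv(4096):
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

for line in whois_lookup("example.com").splitlines():
    # .com answers contain e.g. 'Registry Expiry Date: 2025-08-13T04:00:00Z'
    if "Expiry" in line or "Expiration" in line:
        print(line.strip())
```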

The former content of legitimate sites is often available in the web archive (archive.org). Here is the timeline of one of the sites I checked:

Clicking on the details shows:

  • Last display of legit content in 2008.
  • In 2012 and 2013, a generic message from the hosting provider was displayed: ‘This site has been registered by one of our clients.’
  • After that we see mainly 403 Forbidden errors – so the spammers don’t want their site to be archived – but at one point a screen capture of the spammy site had been taken.

The new site shows the name of the former owner at the bottom, but an unobtrusive link has been added, indicating the new owner – a US-based marketing and SEO consultancy.

So my takeaway is: if you ever feel like decluttering your websites and freeing yourself of your useless digital possessions – and possibly also social media accounts – think twice: as soon as your domain or name is available, somebody might take it, and re-use and exploit your former content, and possibly your former reputation, for promoting their spammy stuff in a shady way.

This happened a while ago, but I know now it can get much worse: why only distribute marketing spam if you can distribute malware through channels still considered trusted? In this blog post Malwarebytes raises the question of whether such practices are illegal – it seems that question is not straightforward to answer.

Visitors do not even have to visit the abandoned domain explicitly to be served malware. I have seen some reports of abandoned embedded plug-ins turned into malicious zombies. Silly example: if you embed your latest tweets, Twitter goes out of business, and its domains are seized by spammers – your Follow Me icon might help to spread malware.

If a legit site runs third-party code, they need to trust the authors of this code. For example, Equifax’s website recently served spyware:

… the problem stemmed from a “third-party vendor that Equifax uses to collect website performance data,” and that “the vendor’s code running on an Equifax Web site was serving malicious content.”

So if you run any plug-ins, embedded widgets, or the like – better check regularly whether the originating domain is still run by the expected owner; monitor your vendors often; and don’t run code you do not absolutely need in the first place. Don’t use embedded active badges if a simple link to your profile would do.
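
A rough sketch of such a periodic check, using only the Python standard library: list the domains a page loads scripts from and flag those that no longer resolve. (DNS resolution is only a first-pass signal – a seized domain still resolves fine, so you would combine this with a WHOIS ownership check like the one sketched earlier.)

```python
# List third-party script domains on a page and flag any that no longer
# resolve in DNS - a first hint that the domain may have lapsed.
import socket
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urlparse

class ScriptSrcParser(HTMLParser):
    """Collect the host names of all <script src=...> tags."""
    def __init__(self):
        super().__init__()
        self.domains = set()
    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src") or ""
            host = urlparse(src).hostname
            if host:
                self.domains.add(host)

url = "https://example.com"   # the site you want to audit
html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
parser = ScriptSrcParser()
parser.feed(html)
for domain in sorted(parser.domains):
    try:
        socket.getaddrinfo(domain, 443)
        print(f"OK       {domain}")
    except socket.gaierror:
        print(f"NO DNS!  {domain}")
```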

Do a painful, boring inventory and assessment often – then you will notice how much work it is to manage these ‘partners’, and you might rather stay away from signing up and registering for too many services.

Update 2017-10-25: And as we speak, we learn about another example – the snatching of a domain used for Dell backup software, preinstalled on PCs.

Other People Have Lives – I Have Domains

These are just some boring update notifications from the elkemental Webiverse.

The elkement blog has recently celebrated its fifth anniversary, and the punktwissen blog will turn five in December. Time to celebrate this – with new domain names that say exactly what these sites are: ‘elkement.blog’ and ‘punktwissen.blog’.

Actually, I wanted to get rid of the ads on both blogs, and with the upgrade came a free domain. WordPress has a detailed cookie policy – and I am showing it dutifully using the respective widget, but they have to defer to their partners when it comes to third-party cookies. I only want to worry about research cookies set by Twitter and Facebook, but not by ad providers, and I am also considering removing the social media sharing buttons and the embedded tweets. (Yes, I am thinking about this!)

On the websites under my control I went full dinosaur: the server sends only non-interactive HTML pages to the client, not requiring any client-side activity. I have now gotten rid of the last half-hearted usage of a session object and the respective cookie, and I have never used any social media buttons or other tracking.

So there are no login data or cookies to protect, and yet I finally migrated all sites to HTTPS.

It is a matter of principle: I of all website owners should use HTTPS. For 15 years I have been planning and building Public Key Infrastructures and troubleshooting X.509 certificates.

But of course I fear Google’s verdict: they announced long ago that HTTPS would be considered a positive ranking signal by their search engine. Pages not using HTTPS will be tagged as insecure with more and more terrifying icons – e.g. HTTP-only pages with login buttons already display a crossed-out padlock in Firefox. In the past years I migrated a lot of PKIs from SHA-1 to SHA-256 to fight the first wave of Insecure icons.

Finally Let’s Encrypt has started a revolution: free SSL certificates, based on domain validation only. My hosting provider uses a solution based on Let’s Encrypt – a reverse proxy that does the actual HTTPS. I only had to re-target all my DNS records to the reverse proxy – it would have been very easy had it not been for all my already existing URL rewriting, tweaking, and redirecting. I also wanted to keep the option of still using HTTP in the future for tests and special scenarios (like hosting a revocation list), so I decided to redirect in the application(s) myself instead of using the offered automated redirect. But a code review and clean-up now and then can never hurt 🙂 For large, complex sites the migration to HTTPS is anything but easy.
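
The idea of the application-side redirect, sketched in Flask (not my actual stack – just an illustration): force HTTPS for everything except a whitelist of paths that must stay reachable over plain HTTP, such as a certificate revocation list, which clients may need to fetch before they can validate any certificate.

```python
# Sketch: application-level HTTP-to-HTTPS redirect with explicit exceptions.
# The /pki/crl path is a hypothetical example of content kept on plain HTTP.
from flask import Flask, redirect, request

app = Flask(__name__)
HTTP_ALLOWED = ("/pki/crl",)   # paths deliberately kept reachable over HTTP

@app.before_request
def enforce_https():
    # Behind a TLS-terminating reverse proxy, the original scheme typically
    # arrives in the X-Forwarded-Proto header.
    scheme = request.headers.get("X-Forwarded-Proto", request.scheme)
    if scheme == "http" and not request.path.startswith(HTTP_ALLOWED):
        return redirect(request.url.replace("http://", "https://", 1), code=301)

@app.route("/")
def index():
    return "Hello over HTTPS"
```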

In case I ever forget which domains and host names I use, I just need to check out this list of Subject Alternative Names again:

(And I have another certificate for the ‘test’ host names that I need for testing the sites themselves and also for testing various redirects ;-))
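
If you want to double-check what a reverse proxy actually presents in your name, you can pull the served certificate’s SANs with a few lines of standard-library Python (the hostname below is just my own domain, used as an example):

```python
# Connect with a verifying TLS context and read the certificate's
# subjectAltName entries straight from the handshake.
import socket
import ssl

def get_sans(hostname: str, port: int = 443):
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    # getpeercert() returns subjectAltName as tuples like ('DNS', 'example.com')
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]

print(get_sans("elkement.blog"))
```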

WordPress.com also uses Let’s Encrypt (Automattic is a sponsor), and the SAN elkement.blog is lumped together with several other blog names, presumably the ones that needed new certificates at about the same time.

It will be interesting to see what the consequences for phishing websites will be. Malicious websites will look trusted, as they are issued certificates automatically – but revoking a certificate might provide another method for invalidating a malicious website.

Anyway, special thanks to the WordPress.com Happiness Engineers and support staff at my hosting provider Puaschitz IT. Despite all the nerdiness displayed on this blog I prefer hosted / ‘shared’ solutions when it comes to my own websites because I totally like it when somebody else has to patch the server and deal with attacks. I am an annoying client – with all kinds of special needs and questions – thanks for the great support! 🙂

Give the ‘Thing’ a Subnet of Its Own!

To my surprise, the most clicked post ever on this blog is this:

Network Sniffing for Everyone:
Getting to Know Your Things (As in Internet of Things)

… a step-by-step guide to sniffing the network traffic of your ‘things’ contacting their mothership, plus a brief introduction to networking. I wanted to show how you can trace your networked devices’ traffic without any specialized equipment, just being creative with what many users might already have: turning a Windows PC into a router with Internet Connection Sharing.

Recently, an army of captured things took down part of the internet, and this reminded me of that post. No, this is not one more gloomy article about the Internet of Things. I just needed to use this Internet Connection Sharing feature for the very purpose it was actually invented for.

The Chief Engineer had finally set up the perfect test lab for programming and testing freely programmable UVR16x2 control systems (successor of the UVR1611). But this test lab was in a spot not equipped with wired Ethernet, and the control unit’s data logger and Ethernet gateway, the so-called CMI (Control and Monitoring Interface), only has a LAN interface and no WLAN.

So an ages-old test laptop was revived to serve as a router (improving its ecological footprint in passing): this notebook connects to the standard ‘office’ network via WLAN; this wireless connection is thus the internet connection that can be shared with a device connected to the notebook’s LAN interface, e.g. via a cross-over cable. As explained in detail in the older article, the router-laptop then allows for sniffing the traffic – but above all it allows the ‘thing’ to connect to the internet at all.

This is the setup:

Using a notebook with Internet Connection Sharing enabled as a router to connect the CMI (UVR16x2's Ethernet gateway) to the internet

The router laptop is automatically configured with IP address 192.168.137.1 and hands out addresses in the 192.168.137.x network as a DHCP server, while using an IP address provided by the internet router for its WLAN adapter (indicated here as the commonly used 192.168.0.x addresses). If Windows 10 is used on the router-notebook, you might need to re-enable ICS after a reboot.
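
Since the router-notebook sits on the only path between the ‘thing’ and the internet, capturing the thing’s traffic needs no extra hardware. A minimal sketch with the third-party scapy library (needs admin rights and, on Windows, a capture driver such as Npcap; 192.168.137.0/24 is the ICS default subnet described above):

```python
# Print a one-line summary of every packet to or from the ICS subnet.
from scapy.all import sniff  # third-party: pip install scapy

def show(packet):
    print(packet.summary())

sniff(filter="net 192.168.137.0/24", prn=show, store=False)
```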

The control unit is connected to the CMI via CAN bus – so the combination of test laptop, CMI, and UVR16x2 control unit is similar to the setup used for investigating CAN monitoring recently.

The CMI ‘thing’ is tucked away in a private subnet dedicated to it, and it cannot be accessed directly from any ‘Office PC’ – except the router PC itself. A standard office PC (green) effectively has to access the CMI via the same ‘cloud’ route as an Internet User (red). This makes the setup a realistic test for future remote support – when the CMI plus control unit has been shipped to its proud owner and is configured on the final local network.

The private subnet setup is also a simple workaround in case several things cannot get along with each other: for example, an internet TV service flooded the CMI’s predecessor BL-NET with packets that were hard to digest – so BL-NET refused to work without a further reboot. Putting the sensitive device in a private subnet – using a ‘spare part’ router – solved the problem.

The Chief Engineer's quiet test lab for testing and programming control units

Internet of Things. Yet Another Gloomy Post.

Technically, I work with Things, as in the Internet of Things.

As outlined in Everything as a Service many formerly ‘dumb’ products – such as heating systems – become part of service offerings. A vital component of the new services is the technical connection of the Thing in your home to that Big Cloud. It seems every energy-related system has got its own Internet Gateway now: Our photovoltaic generator has one, our control unit has one, and the successor of our heat pump would have one, too. If vendors don’t bundle their offerings soon, we’ll end up with substantial electricity costs for powering a lot of separate gateways.
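
Just to put a number on that – a back-of-envelope estimate with assumed values, not measurements:

```python
# Back-of-envelope cost of always-on gateways (all numbers are assumptions,
# not taken from the post): a handful of small embedded devices at a few
# watts each, running around the clock.
gateways = 4          # e.g. PV inverter, control unit, heat pump, smart meter
watts_each = 5.0      # typical small embedded device
price_per_kwh = 0.20  # EUR, assumed household tariff
kwh_per_year = gateways * watts_each * 8760 / 1000
print(f"{kwh_per_year:.0f} kWh/year ≈ EUR {kwh_per_year * price_per_kwh:.0f}")
# -> 175 kWh/year ≈ EUR 35
```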

Experts have warned for years that the Internet of Things (IoT) comes with security challenges. Many Things’ owners still keep default or blank passwords, but the most impressive threat in my opinion is not the hacking of individual systems: easily hacked things can be hijacked to serve as zombie clients in a botnet and launch a joint Distributed Denial of Service attack against a single target. Recently the blog of renowned security reporter Brian Krebs was taken down, most likely as an act of revenge by DDoSers (crime is now offered as a service as well). The attack – a tsunami of more than 600 Gbps – was described as one of the largest the internet had seen so far. Hosting provider OVH was subject to a record-breaking Tbps attack – launched via captured … [cue: hacker movie cliché] … cameras and digital video recorders on the internet.

I am about the millionth blogger ‘reporting’ on this, nothing new here. But in my mind the social media news about the DDoS attacks collided with another social media micro-outrage – about seemingly unrelated IT news: HP had to deal with not-so-positive reporting about its latest printer firmware changes and related policies – when printers started to refuse to work with third-party cartridges. This seems to be a legal issue, or has been presented as such, and I am not interested in that aspect here. What I find interesting is the clash of requirements: after the DDoS attacks many commentators said IoT vendors should be held accountable – they should be forced to update their stuff. On the other hand, end users should remain owners of the IT gadgets they have bought, so the vendor has no right to inflict any policies on them and restrict the usage of the devices.

I can relate to both arguments. One of my main motivations ‘in renewable energy’ or ‘in home automation’ is to make users powerful and knowledgeable owners of their systems. On the other hand, I have been ‘in security’ for a long time. And chasing firmware for IoT devices can be tough for end users.

It is a challenge to walk this tightrope gracefully: a printer may traditionally be considered an item we own, whereas the internet router provided by the telco is theirs. So we can tinker with the printer’s inner workings as much as we want, but we must not touch the router – the telco takes care of its firmware updates. But old-school devices are being given more ‘intelligence’ and need to be connected to the internet to provide additional services – like that printer that lets you print from your smartphone easily (yes, but only if you register it at the printer manufacturer’s website first). In addition, our home is not really our castle anymore. Our computers aren’t protected by the telco’s router / firmware all the time; we work in different networks or in public places. All the Things we carry with us, someday smart wearable technology, will check in to different wireless and mobile networks – so their security bugs had better be fixed in time.

If IoT vendors are to be held accountable and update their gadgets, they have to be given the option to do so. But if the device’s owner tinkers with it, firmware upgrades might stall. In order to protect themselves from legal prosecution, vendors need to state in contracts that they are determined to push security updates and that you must not interfere with them. Security can never be enforced by technology only – not for a device located at the end user’s premises.

It is a horrible scenario – and I am not sure if I refer to hacking or to the proliferation of even more bureaucracy and over-regulation, which is supposed to protect us from hacking but will add more hurdles for would-be start-ups that dare to sell hardware.

Theoretically a vendor should be able to separate the security-relevant features from nice-to-have updates. For example, in a similar way, in smart meters the functions used for metering (subject to metering law) should be separated from ‘features’ – the latter being subject to remote updates while the former must not be. Sources told me that this is not an easy thing to achieve, at least not as easy as presented in the meters’ marketing brochures.

Linksys's Iconic Router

That iconic Linksys router – sold for more than 10 years (and a beloved test device of mine). Still popular because you could use open-source firmware. Something that new security policies might seek to prevent.

If hardware security cannot be regulated, there might be more regulation of internet traffic. Internet Service Providers could be held accountable for removing compromised devices from their networks, for example after having notified the end user several times. Or smaller ISPs might be cut off by upstream providers. Somewhere in the chain of service providers we will have to deal with more monitoring and regulation, and in one way or another the playful days of the earlier internet (romanticized with hindsight, maybe) are over.

When I saw Krebs’ site going offline, I wondered what small businesses should do in general: his site is now DDoS-protected by Google’s Project Shield, a service offered to independent journalists and activists, after his former pro-bono host could not deal with the load without affecting paying clients. So one of the Siren Servers I have so often commented on critically came to the rescue! A small provider will not be able to deal with such attacks.

WordPress.com should be well-protected, I guess. I wonder if we will all end up hosting our websites at such major providers only, or ‘blogging’ directly on Facebook, Google, or LinkedIn (now part of Microsoft) to be safe. I had advised against self-hosting WordPress myself: if you miss security updates you might jeopardize not only your website, but also others using the same shared web host. If you live on a platform like WordPress or Google, you will complain from time to time about limited options or feature updates you don’t like – but you don’t have to care about security. I compare this to avoiding legal issues as an artisan selling hand-made items via Amazon or the like, in contrast to having to update your own shop’s business logic after every change in international tax law.

I have no conclusion to offer. Whenever I read news these days – on technology, energy, IT, anything in between, The Future in general – I feel reminded of this tension: Between being an independent neutral netizen and being plugged in to an inescapable matrix, maybe beneficial but Borg-like nonetheless.

Have I Seen the End of E-Mail?

Not that I desire it, but my recent encounters with ransomware make me wonder.

Some people in, say, accounting or HR departments are forced to use e-mail with the utmost paranoia. Hackers send alarmingly professional e-mails that look like invoices, job applications, or notifications from postal services. Clicking a link starts the download of malware that will encrypt all your data and ask for ransom.

Theoretically you could still find out if an e-mail was legit by cross-checking with open invoices, job ads, and expected mail. But what if hackers learn about your typical vendors from your business website, or if they read your job ads? Then they could send plausible e-mails and might refer to specific codes, like the number of your job ad.

Until recently I figured that only medium or larger companies would be subject to targeted attacks. One major Austrian telco was the victim of a Denial of Service attack and was challenged to pay ransom. (They didn’t, and were able to deal with the attack successfully.)

But then I have encountered a new level of ransomware attacks – targeting very small Austrian businesses by sending ‘expected’ job applications via e-mail:

  • The subject line was Job application as [a job that had been advertised weeks ago at a major governmental job service platform].
  • It was written in flawless German, using typical job applicants’ lingo as taught in trainings.
  • It was addressed to the personal e-mail address of the employee dealing with applications, not the public ‘info@’ address of the business.
  • There was no attachment – so malware filters could not have found anything suspicious – but only a link to a shared cloud folder (‘…as the attachments are too large…’) – run by a legit European cloud company.
  • If you clicked the link (which you should not do unless you do it on a separate test-for-malware machine in a separate network) you saw a typical applicant’s photo and a second file – whose name translated to JobApplicationPDF.exe.

Suspicious features:

  • The EXE file should have triggered red lights (see the toy sketch after this list). But it is not impossible that a job application comes as a self-extracting archive, although I would compare that to wrapping your paper application in a box that looks like a fake bomb.
  • Google’s Image Search showed that the photo had been stolen from a German photographer’s website – it was an example of a typical job applicant’s photo.
  • Both the cloud and the mail service used were lesser-known ones. It has been reported that Dropbox had removed suspicious files, so it seems the attackers turned to alternative services. (Both mail and cloud provider reacted quickly and shut down the suspicious accounts.)
  • The e-mail did not contain a phone number or street address, just the pointer to the cloud store: possible, but weird, as an applicant should be eager to encourage communication via all channels. There might be ‘normal’ issues with accessing a cloud store link (e.g. a link falsely blocked by a corporate firewall) – so the HR department should be able to call the applicant.
  • Googling the body text of the e-mail gave one result only – a new blog entry by an IT professional quoting it at full length. The subject line was personalized to industry sector and a specific job ad – but the bulk of the text was not.
  • The non-public e-mail address of the HR person was googleable, as the job ad plus contact data had appeared on a job platform in a different language and country, without the small company’s consent of course. So both the e-mail address and the job description had been harvested automatically.
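
As referenced in the first bullet above, here is a toy illustration of a purely name-based red-flag check. Real mail filters inspect content (magic bytes, signatures, sandbox behaviour), not just names; all file names except the one from this story are made up:

```python
# Toy heuristic: flag file names that dress an executable up as a document,
# like the 'JobApplicationPDF.exe' from the story above.
SUSPICIOUS_ENDINGS = (".exe", ".scr", ".com", ".bat", ".js")
DOCUMENT_HINTS = ("pdf", "doc", "invoice", "application", "bewerbung")

def looks_deceptive(filename: str) -> bool:
    name = filename.lower()
    return name.endswith(SUSPICIOUS_ENDINGS) and any(
        hint in name for hint in DOCUMENT_HINTS
    )

for f in ["JobApplicationPDF.exe", "photo.jpg", "invoice_2016.pdf.scr"]:
    print(f, "->", "suspicious" if looks_deceptive(f) else "ok")
```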

I also wonder if my Everything as a Service vision will provide a cure: more and more communication has been moved to messaging on social networks anyway – for convenience and to avoid legit messages being falsely flagged as spam. E-mail – powered by the old SMTP protocol with tacked-on security features, run on decentralized mail servers – is being replaced by messaging happening within a big monolithic block of a system like Facebook messaging. Some large employers already require applicants to submit their CVs via their web platforms, just as large corporations demand that their suppliers use their billing platforms instead of sending invoices per e-mail.

What needs to be avoided is downloading an executable file and executing it in an environment not controlled by security policies. A large cloud provider might have a better chance to enforce security, and viewing or processing an ‘attachment’ could happen in the provider’s environment. As an alternative, all ‘our’ devices might actually be part of a service and controlled more tightly by centrally set policies. Disclaimer: not sure if I like that.


(‘Computer virus’ – from my first website 1997. Credits mine)


All My Theories Have Been Wrong. Fortunately!

I apologize to Google. They still like my blog.

This blog’s numbers plummeted as per Webmaster Tools; here and here you can find everything you never wanted to know about it. I finally figured that my blog was a victim of Google’s latest update, Panda 4.1. Sites about ‘anything’ had suffered, and the Panda rollout matched the date of the onset of the decline.

Other things happened in autumn, too: I had displayed links to latest WordPress blog posts on my other websites, but my feed parser suddenly refused to work. The root cause was the gradual migration of all WP.com blogs and feeds to https:// only. Only elkement’s blog had been migrated at that time; our German blog’s feed was affected two months later.

Recently the German blog also started its descent in impressions and clicks, again two months after elkement’s blog. I pondered https URLs again – the correlation was too compelling. Then suddenly the answer came to me:

!

!!

!!!

You need to add the https URL as an additional site in Webmaster Tools.

!!!

!!

!

It was that simple. All the traffic I missed was here all the time – tucked away in the statistics for https://elkement.wordpress.com. This also answers the question I posed in my last Google rant post: Why do I see more Search Engine referrers in WordPress stats than clicks in Webmaster Tools? I had just looked in the wrong place.

I had briefly considered the https thing last year but ruled it out as I misinterpreted Webmaster Tools – falsely believing that one entry for a site would cover both the http and the https versions. These are the results for both URLs – treated as separate entities by Webmaster Tools:

Results for http : // elkement.wordpress.com  – abysmal:

(Edit: I cannot use a link here and have to add those weird blanks – otherwise WP will always convert both URL and text to https automatically even if the prefix is displayed as http in the editor.)

Google traffic for http version of this blog

Results for https://elkement.wordpress.com – better by a factor of 100:

Way more Google traffic for the https version of this blog URL

Popular pages were the first to ‘move’ over to the https entry. This explains why my top page was missing first from the http pages’ impressions – the book review which I had assumed to have been penalized by Panda as an alleged cross-link scam. In full paranoia mode I was also concerned about my adding random Wikimedia images to my poetry.

But now I will do it again as I feel relieved. And relaxed – as this Panda.

______________________________

You have read a post in my new category Make a Fool of Myself. (I tried to top the self-sabotaging effect of writing about my business website being hacked – as a so-called security expert.)

Yet the theory was all too compelling. I found numerous examples of small sites penalized by Panda in a weird way. See this discussion: a shop’s webmaster makes a product database with succinct descriptions available online and is penalized for ‘keyword spamming’ – as his keywords are part of each product name. Advice by SEO experts: paraphrase your product names.

Legend has it that Panda was named after a Google engineer. I figured it was because the panda is so choosy, insisting on bamboo (*), just as Google scrutinizes our sites more and more. (*) One more theory I got wrong, now edited! Thanks to commenter Cleo for pointing out the mistake.

Waging a Battle against Sinister Algorithms

I have felt a disturbance of the force.

As you might expect from a blog about anything, this one has a weird collection of unrelated top pages and posts. My WordPress Blog Stats tell me I am obviously an internet authority on: how rodents get into kitchen appliances, the physics of a spinning toy, the history of the first heat pump, and most recently how to sniff router traffic. But all those posts and topics are eclipsed by the meteoric rise of the single most popular article ever, which was a review of a book on a subfield of theoretical physics. I am not linking this post or quoting its title, for reasons you might understand in a minute.

Checking out Google Webmaster Tools, the effect is even more pronounced. Some months ago this textbook review attracted by far the most Google search impressions and clicks. Looking at the data from the perspective of a bot, it might appear as if my blog had been created just to promote that book. Which is what I believe might actually have happened.

Concluding from historical versions of the book author’s website (on archive.org), the page impressions of my review started to surge when he put a backlink to my post on his page, sometime in spring this year.

But then in autumn this happened.

Page impressions for this blog on Google Webmaster Tools, Sept to Dec.

These are the impressions for searches from desktop computers (‘Web’), without image or mobile search. A page impression means that the link has been displayed on a Google search results page to some user. The curve does not change much if I remove the filter for Web.

For this period of three months, that article I Shall Not Quote is the top page in terms of impressions, right after the blog’s default page. I wondered about the reason for this steep decline as I usually don’t see any trend within three months on any of my sites.

If I decrease the time slot to the past month that infamous post suddenly vanishes from the top posts:

Page impressions and top pages in the last month

It was eradicated quickly – which can only be recognized by decreasing the time slot step by step. Within a few days at the end of October / beginning of November the entry seems to have been erased from the list of impressions.

I sorted the list of results shown above by the name of the page, not by impressions. Since WordPress post names are prefixed with dates, you would expect to see any of your posts somewhere in that list, some of them of course with very low scores. Actually, that list does include even obscure early posts from 2012 that nobody ever clicks on.

The former top post, however, did not get a single impression anymore in the past month. I have highlighted the posts before and after it in the list, and I have removed all filters for this one, so image and mobile search are also taken into account. The post’s name started with /2013/12/22/:

Last month, top pages, recent top post missing

Checking the status of indexed pages in total confirms that links have recently been removed:

Index status of this blog

For my other sites and blogs this number is basically constant – as long as a website does not get hacked. As our business site actually was, a month ago. Yes, I only mention this in passing, as I am less worried about that hack than about the mysterious penalizing of this blog.

I learned that your typical hack of a website is less spectacular than what hacker movies make you believe: if you are not a high-profile target, hacker-spammers leave your site intact, but place additional spammy pages with cross-links on your site to promote their links. You recognize this immediately by a surge in the number of URLs, in indexing activities, and – in case your hoster is as vigilant as mine – a peak in 404 Not Found errors after the spammy pages have been removed. This is the intermittent spike in spammy pages on our business page crawled by Google:

Crawl stats after hack

I used all tools at my disposal to clean up the mess the hackers caused – those pages had actually been indexed already. It will take a while until things like ‘fake Gucci belts’ are removed from our top content keywords, after I removed the links from the index by editing robots.txt and using the Google URL removal tool and the URL parameters tool (the latter comes in handy as the spammy pages had been indexed with various query strings, that is: parameters).
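
For what it’s worth, a tiny standard-library script can confirm that such URLs really are gone (404, or better 410 Gone) before asking search engines to drop them – the URLs below are hypothetical stand-ins:

```python
# Verify that cleaned-up spam URLs now return 404/410 instead of content.
import urllib.error
import urllib.request

spammy_urls = [
    "https://example.com/cheap-brand-belts.html?color=red",
    "https://example.com/replica-watches.html",
]

for url in spammy_urls:
    try:
        status = urllib.request.urlopen(url, timeout=10).status
    except urllib.error.HTTPError as err:
        status = err.code
    print(status, url)   # 404 or 410 means the clean-up stuck
```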

I had expected the worst, but Google has not penalized me for that intermittent link spam attack (yet?). Numbers are now back to normal after a peak in queries for that fake brand stuff:

Queries back to normal after clean-up.

It was an awful lot of work to clean those URLs popping up again and again every day. I am willing to fight the sinister forces without too much whining. But Google’s harsh treatment of the post on this blog freaks me out. It is not only the blog post that was affected, but also the pages for the tags, categories, and archive entries. Nearly all of these pages – thus all the pages linking to the post – did not get a single impression anymore.

Google Webmaster Tools also tells me that the number of so-called Structured Data for this blog had been reduced to nearly zero:

Structured data on this blog

Structured Data are useful for pages that show e.g. product reviews or recipes – anything that should have a pre-defined structure and that might be presented according to that structure in Google search results, via nicely formatted snippets. My home-grown websites do not use those, but the spammer-hackers had used such data in their link spam pages – so on our business site we saw a peak in structured data at the time of the hack.

Obviously WP blogs use those by design. Our German blog is based on the same WP theme – but the number of structured data items there has been constant. So if anybody out there is using the theme Twenty Eleven, I would be happy to learn about your encounters with structured data.

I have read a lot: everything I never wanted to know about search engine optimization. This also included hackers’ black-hat SEO. I recommend the book Spam Nation by renowned investigative reporter and IT security insider Brian Krebs, published recently. Whose page and book I will, again, not link.

What has happened? I can only speculate.

Spammers build networks of shady backlinks to promote their stuff. So common knowledge is, of course, that you should not buy links or create such network scams. Ironically, I have cross-linked all my own sites like hell for many years. Not for SEO purposes, but in my eternal quest for organizing my stuff – keeping things separate, but adding the right pointers, Raking the virtual Zen Garden, etc. Never ever did this backfire. I was always concerned about the effect of my links and resources pages (links to other pages, mainly tech and science). Today my site radices.net, which was once an early German predecessor of this blog, is my big link dump – but still these massive link collections are not voted down by Google.

Maybe Google considers my post and the physics book author’s website part of such a link scam. I have linked to the author’s page several times – to sample chapters, generously made available for download as PDFs – and the author linked back to me. I had so far refused to tie my blog to my Google+ account and claim ‘Google authorship’, as I did not want to trade elkement for my real name on G+. Via Webmaster Tools Google knows about all my domains, but they might suspect that I – a pseudo-anonymous elkement, using an @subversiv.at address on G+ – might also own the book author’s domain that I – diabolically smart – did not declare in Webmaster Tools.

As I said before, from a most objective perspective Google’s rationale might not be that unreasonable. I don’t write book reviews that often; my most recent ones were about The Year Without Pants and The Glass Cage. I rather write posts triggered by one idea in a book, maybe not even the main one. When I write about books I don’t use Amazon affiliate marketing – as professional reviewers such as Brain Pickings or Farnam Street do. I write about unrelated topics. I might not match the expected pattern. This is amusing as long as only a blog is concerned, but in principle it is similar to being interviewed by the FBI at an airport because your travel pattern just can’t be normal (as detailed in the book Bursts, on modelling human behaviour – a book I also sort of reviewed last year).

In short, I sometimes review and ‘promote’ books without any return on that. I simply don’t review books I don’t like, as I think blogging should be fun. Maybe in an age of gamified reviews and fake forum posts with spammy signatures, Google simply doesn’t buy into that. I sympathize. I learned that forum websites should add a nofollow tag to any hyperlinks users post, so that Google will not downvote the link targets. So links in discussion groups are considered spammy per se, and you need to do something about it so that they don’t hurt what you – as a forum user – are probably trying to discuss or recommend in good faith. I already live in fear that those links some tinkerers set in DIYers’ forums (linking to our business site or my posts on our heating system) will be considered paid link spam.

However, I cannot explain why I can find my book review post on Google (thus generating an impression) when searching for site:[URL of the post]. Perhaps consolidation takes time. Perhaps there is hope. I even see the post when I use the Tor Browser and a foreign IP address, so this is not related to my preferences as a logged-on Google user. But unless there is a glitch in Webmaster Tools, no other typical searcher encounters this impression. I am aware of the tool for disavowing URLs, but I don’t want to report a perfectly valid backlink. In addition, that backlink from the author’s site does not even show up in the list of external backlinks, which is another enigma.

I know that this seems to be an obsession with a first-world problem: this was a post on a topic where I don’t claim expertise and that I don’t consider strategically important. But whatever happens to this blog could happen to other sites I am more concerned about, business-wise. So I hope it is just a bug, and/or Google bots will read this post and release my link. Just in case I mentioned your book or blog here, even if indirectly: please don’t backlink.

Perhaps Google did not like my ranting about encrypted search terms, not available to the search term poet. I dared to display the Bing logo back then. Which I will do again now as:

  • Bing tells me that the infamous post generates impressions and clicks
  • Bing recognizes the backlink
  • The number of indexed pages is increasing gradually with time.
  • And Bing did not index the spammy pages in the brief period they were on our hacked website.

Bing logo (2013)

Update 2014-12-23 – it actually happened twice:

Analyzing the impressions from the last day, I realize that Google has also treated my physics resources page Physics Books on the Bedside Table the same way. Page impressions dropped, and now that page – which was the top one after the review had plummeted – is gone, too. I had already considered moving this page to the site that hosts all those lists of links (without issues, so far): radices.net, and I will complete this migration in a minute. Now of course Google might think that I, the link spammer, am frantically moving on to another site.

Update 2014-12-24 – now at least results are consistent:

I cannot see my own review post anymore when I search for the title of the book. So finally the results from Webmaster Tools are in line with my tests.

Update 2015-01-23 – totally embarrassing final statement on this:

WordPress has migrated their hosted blogs to https only. All my traffic was hiding in the statistics for the https version which has to be added in Google Webmaster Tools as a separate website.