Volume 34 Number 11
3 Things: A Few Last Words on Software
By Dino Esposito | November 2019
For more than two decades, I’ve worked to stay true to the mission of this column and to be constantly on the cutting edge of development. Now that this endeavor is coming to its end, I hope to leave you with insight that can help keep you on the cutting edge in the months and years to come. I won’t be covering any of the new amazing features coming to ASP.NET Core 3, .NET 5, Blazor, ML.NET or gRPC. Nor will I hazard any guesses as to what the future may bring to the IT and software development world in the next five to 10 years. I’m old enough to heed Socrates’ ageless motto: “One thing only I know, and that is that I know nothing.”
So in this, my final column, I’ll share my perspective about three topics and their related buzzwords that are of crucial relevance today.
The Case for Agile in Artificial Intelligence
Everyone in the industry is using artificial intelligence (AI) as a bullhorn to reach and impress the widest possible audience, and emphasizing AI as a tool that can solve any IT issue and streamline any convoluted process. This is all marketing hype.
At its core, AI isn’t magic: it’s data crunching, elbow grease and software acumen. Companies don’t need AI, per se; they need end-to-end solutions to their concrete business problems and measurable results for their efforts. AI, and more specifically machine learning, is the most powerful tool we have to provide a new generation of end-to-end, tailor-made solutions at a reasonable cost.
The problem is, doing AI in the real world is much more frustrating than people think. For complex real-world problems (say, production forecasts in renewable energy) you don’t end up using a canonically trained model. Instead, you work with software artifacts that are only minimally trained (on a small number of variables) and analytically process frequently changing live data such as weather forecasts.
Another great example is fault prediction, or predictive maintenance. If you search around, it may look like a problem suited to an exercise; you can find plenty of short articles about it on Web sites. The thing is, fault prediction requires a Long Short-Term Memory (LSTM) neural network, which is one of the trickiest flavors of neural networks to work with. Still, companies are actively looking into solving the business problem of reducing downtime and optimizing maintenance with more analytical, half-human, half-machine forms of predictions.
All of this is to say that AI is really just a newer and fancier name for problem solving, just with more advanced tools.
Figure 1 The Tight Coupling of Software and Data in Artificial Intelligence
In companies, the data science team is often seen as the panacea team and charged with too many (or too few) responsibilities. The reality is that too often a waterfall-style model is established to rule the collaboration between data science and software. One team delivers the trained model; another team mounts it into the host application. Sometimes it works with the live data of the real world, but, more often, it doesn’t. I’ve seen algorithms needing input that no client application could realistically provide. I’ve seen algorithms trained on data taken from one wind farm and then tested for acceptance on data from another wind farm with different turbines and subject to different orography. The trained algorithm needs to be an integrated part of the software solution.
For serious AI, you need an agile process that positions the data science and software teams to work as a single team, or at least to work together closely. The importance of data/software collaboration is the reason I suggest looking into ML.NET rather than committing to Python for general machine learning development. The rich ecosystem of coded algorithms and libraries for data manipulation in ML.NET means that developers can leverage the power of SQL Server, Entity Framework and the .NET Framework to work with data. There’s no reason for .NET shops to use Python, and thinking there is one is a clear sign that you may be missing some crucial points about AI.
Event Sourcing over Blockchain
More than a decade ago, blockchain was devised to function in the context of a very specific scenario: solving the double spending (DS) problem in a deliberately decentralized system. In a nutshell, in a digital cash system, DS is the problem of preventing digital money (which, ultimately, is just files) from being duplicated or tampered with. A physical coin can be spent only once at a time, and a person can spend it only if he or she has legitimately received the coin from another individual. Blockchain is the software platform devised to avoid DS in the context of the Bitcoin cryptocurrency. (By the way, DS is most decidedly not a trivial problem to solve.)
As Bitcoin gained traction, the term blockchain gained ascendancy. But from a development perspective, the various cloud services offering blockchain networks are ultimately providing storage. So what’s the difference between blockchain and other storage platforms such as relational databases? One is decentralized governance and another is stronger resistance to tampering. A third aspect is the event-based nature of the blockchain storage.
My personal thinking is that the inherent ability of a blockchain network to track every single transaction is the aspect that makes it so attractive and intriguing. Putting something on the blockchain records that a thing happened, and when and how it happened can then be easily traced back. It’s as if the coin in your pocket could let you inspect the entire list of handovers it was subject to since it was released from the mint. That’s what blockchain was made for: maintaining the history of transactions for all individual Bitcoins with the certainty that no transaction or past fact could be altered. Therefore, blockchain makes perfect sense in the context of Bitcoin and cryptocurrencies.
So in what way can blockchain help (some might say disrupt) the general industry?
The only reasonable value I see is for countrywide, or even transnational, networks that store shared and continuously edited information. Who manages these networks, though? In the end, it’s always about storing data at some location (centralized or not) and having someone review and approve it.
The issue goes far beyond the software aspect of storage. If we only talk about traceability of business facts, then event sourcing is a much more concrete alternative. Event sourcing is an approach to building the data layer of software applications in which all the changes to the application state are stored as a sequence of events and each change is treated individually. You have a record (or an event) when an order is created, another event when the order is updated the first time, yet another when it’s updated the second time, and so forth. To get the current state of the order, you must replay the entire list of events, similar to how you determine the balance of a bank account. It’s a number, but it results from the entire list of operations you made over time.
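The replay idea can be sketched in a few lines. What follows is a minimal illustration in Python, not the API of any particular event sourcing framework; the event names (`OrderCreated`, `OrderUpdated`), the order fields and the `replay` helper are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass(frozen=True)
class Event:
    """One recorded change. Events are stored, never overwritten."""
    kind: str              # hypothetical event name, e.g. "OrderCreated"
    data: Dict[str, Any]   # only the fields this event sets or changes


def apply(state: Dict[str, Any], event: Event) -> Dict[str, Any]:
    """Return the order state that results from applying a single event."""
    if event.kind == "OrderCreated":
        return dict(event.data)
    if event.kind == "OrderUpdated":
        return {**state, **event.data}
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events: Iterable[Event]) -> Dict[str, Any]:
    """Rebuild the current state by replaying every event in order."""
    state: Dict[str, Any] = {}
    for event in events:
        state = apply(state, event)
    return state


# The stored history of one order: creation followed by two updates.
order_events = [
    Event("OrderCreated", {"id": 1, "quantity": 2, "address": "Rome"}),
    Event("OrderUpdated", {"quantity": 5}),
    Event("OrderUpdated", {"address": "Milan"}),
]

current_order = replay(order_events)
print(current_order)
```

As with the bank balance, the current order is never stored directly. It is whatever the full list of events adds up to, and replaying only a prefix of the list gives the state at that earlier point in time.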
Event sourcing frameworks work in an append-only mode, with the deletion of objects tracked by adding an event that describes when and why the order was deleted. These frameworks typically don’t fully support blockchain key features such as decentralization, voting and anti-tampering measures. But the idea of tracking everything can also be achieved via event sourcing, and from event sourcing with some extensions you can get very close to blockchain. Hence, the focus returns to the actual problem you want to solve rather than how you’d like to solve it.
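One possible reading of "some extensions" is a hash chain over the event log. The sketch below is an assumption on my part rather than a feature of any specific framework: the `EventLog` class, its methods and the event fields are hypothetical. It shows an append-only log in which a deletion is just another event, and in which each entry stores the SHA-256 hash of the previous one so that later edits become detectable. It makes the log tamper-evident only; decentralization and voting are out of scope.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(kind: str, data: Dict[str, Any], prev: str) -> str:
    """Hash an entry's content together with the previous entry's hash."""
    body = json.dumps({"kind": kind, "data": data, "prev": prev}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class EventLog:
    """Append-only log: entries are added, never updated or removed."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, Any]] = []

    def append(self, kind: str, data: Dict[str, Any]) -> Dict[str, Any]:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry = {"kind": kind, "data": dict(data), "prev": prev,
                 "hash": _digest(kind, data, prev)}
        self._entries.append(entry)
        return entry

    def entries(self) -> List[Dict[str, Any]]:
        return list(self._entries)

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev = GENESIS
        for entry in self._entries:
            expected = _digest(entry["kind"], entry["data"], entry["prev"])
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


log = EventLog()
log.append("OrderCreated", {"id": 1, "quantity": 2})
log.append("OrderUpdated", {"id": 1, "quantity": 5})
# Deleting the order adds an event saying when and why; nothing is erased.
log.append("OrderDeleted", {"id": 1, "reason": "cancelled by customer",
                            "at": "2019-11-01T10:00:00Z"})
print(log.verify())
```

If someone later edits an earlier entry in place, `verify()` returns `False`, because the stored hash no longer matches the recomputed one. A single party that controls the whole log could still rewrite every hash, which is exactly the gap that the decentralization and voting features of a blockchain are meant to close.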
Over-Engineering: Just Say No
We’ve seen modern software giants court complexity to adapt to rapid growth. Facebook adopted polyglot persistence to serve its hundreds of millions of active users. Amazon switched to Web services, Netflix to microservices and Google to gRPC over REST. Alas, I fear we may be learning the wrong lessons from these success stories.
For instance, today it seems almost inevitable that any Web application must be designed around microservices. I’m old enough to have worked with Distributed COM and I recall the first (and only) law of Distributed Programming: Do Not Distribute It. A distributed architecture has pros and cons and it’s never obvious when the pros will overtake the cons. Working within a network brings new issues such as latency and synchronization, and amplifies others such as debugging and security. In a nutshell, everything is more complex.
Complexity is justifiable when it brings a tangible return. But in my experience, starting out with a complex design because it was used successfully in another scenario has never proved wise. Instead, you should constantly monitor the effectiveness of your design and add complexity only where and when necessary. Over-engineering is always costly and risky.
A Farewell Note
I’ve been asked many times how I can keep up with quickly changing software technologies and grasp the issues related to development problems without writing enterprise code around the clock. My secret is I try to get at the foundation of things and think of software as the mirror of business processes designed by humans. Deep understanding of the business smooths the task of writing software and makes it kind of mechanical work. Deep understanding of a technology makes it easy to see whether it fits a specific business problem.
To a certain extent, software is a physical thing and is subject to familiar physical laws, such as entropy. At the same time, software is a human thing and, as such, is influenced by some context-specific form of empathy—the ability to understand. While we all know about entropy and its impact on code over time, I’m more interested in empathy, which expresses the ability of software to penetrate the internal mechanics of processes. Empathy impacts the quality of software and can even reduce entropy in the long run.
The power of understanding things (call it software empathy) is what I leverage in my everyday activities and what I tried to illustrate by example in my columns. I ended up writing because I had always wanted to be a writer, and I found that being a coder was one way to be a writer. So I put passion and empathy into it, the same way I worked to push passion and empathy into my code. Did it work? The fact that this column has thrived for more than 20 years is a clue!
I wish to end with wisdom from some great men, just adapted to software. Let’s call them my messages for posterity:
- Keep it simple, as simple as possible. Otherwise it’s over-engineered. (Adapted from Albert Einstein)
- Don’t forget to pay your technical debt. (Adapted from the reported last words of Socrates)
- Perhaps a problem did not want to be solved so much as to be understood. (Adapted from George Orwell)
Stay in touch at youbiquitous.net.
Dino Esposito has authored more than 20 books and 1,000-plus articles in his 25-year career. Author of “The Sabbatical Break,” a theatrical-style show, Esposito is busy writing software for a greener world as the digital strategist at BaxEnergy. Follow him on Twitter: @despos.