Stephen Nimmo

Energy Trading, Risk Management and Software Development

April 2014

Identity Crisis

A few months ago, I was having a conversation over coffee with an IT operations manager at a very large company about their lack of standards for managing deployments and their dependency on manual processes. As these conversations usually go, what started as a targeted discussion about optimizing a single aspect of software development turned into a fantastic philosophical journey across the entire spectrum of issues, from managing executive expectations to what software development will look like ten years from now. Keep in mind, this is a multinational company with hundreds of programmers, ops, and IT support employees, managing millions of lines of completely custom software as well as millions of lines of code customizing vendor packages, with software budgets in the hundreds of millions. That’s when the statement was dropped:

“We aren’t a software company.”

Much like Happy Gilmore, who would never admit he’s a golfer rather than a hockey player, many large companies out there are having this identity crisis and refusing to acknowledge reality. Even if the product a company produces is not software, that does not exempt its operations from adopting good software processes and standards. For example, if an employee of a widget manufacturing company were asked what the company did, the answer would be making widgets. In the same vein, if you asked their CFO whether the accounting department followed GAAP standards, the answer would certainly be “yes”. However, when you walk into the IT department and ask the developers or software managers whether they do things like unit testing, regression testing or automated security controls, you would be surprised how often the answer is no. Why is there a perceived difference?

The main difference is visibility. Auditors don’t come in to make sure good software development standards are in place, but they do come in and pore over the ledgers to ensure accounting is happening according to standards. If standards aren’t in place in your accounting department, the pain can be immense, especially for public companies. However, accounting standards aren’t followed simply because people will be checking. They are in place because collective business experience has shown that following standards pays off: in the long term, standards create efficiencies and reduce costly errors. Yet these same companies choose to implement rigid standards in one aspect of their business while leaving other departments, with hundreds of employees and millions of budgeted dollars, free of standards.

Leaving basic standards like unit testing and automated builds out of the software development process is visible too, just not in the same way. That production outage last week because of a last-minute code change? That happened because there weren’t any unit tests. That year-long, million-dollar software upgrade that turned into a three-million-dollar, two-year effort? That happened because there weren’t any requirements standards, performance testing or architectural standards in place. The lack of standards in software development is just as visible and just as painful as an accounting department not following GAAP; the difference is the inability to draw the correct conclusions and identify the root cause. The employees ultimately responsible will attribute the outage to the bad code or the poor requirements, but the real answer is a lack of effective standards and best practices.

Just as GAAP standards won’t fix every accounting woe, great software development practices won’t catch every issue, but they sure will stop a lot of them. So the next time there is a big production outage, ask a different set of questions:

  • Can you provide me a list of all the unit tests that were run for this?
  • Did the regression tests fail for this release?
  • When the rollback scripts were run, why weren’t they effective?

The answers might surprise you.
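
To make the first question concrete, here’s a minimal sketch of what such a unit test looks like, using JUnit 4. The PositionCalculator class and its behavior are hypothetical, invented for illustration; the point is that these checks are code, they run automatically on every build, and a list of them can be produced on demand.

```java
import static org.junit.Assert.assertEquals;

import java.util.HashMap;
import java.util.Map;
import org.junit.Test;

// Hypothetical example: a tiny position keeper and the unit tests that
// protect it from regressions. Names and behavior are illustrative only.
public class PositionCalculatorTest {

    // Minimal class under test, inlined so the example is self-contained.
    static class PositionCalculator {
        private final Map<String, Integer> positions = new HashMap<>();

        void addTrade(String product, int quantity) {
            positions.merge(product, quantity, Integer::sum);
        }

        int netPosition(String product) {
            return positions.getOrDefault(product, 0);
        }
    }

    @Test
    public void buyAndSellOfEqualQuantityNetsToZero() {
        PositionCalculator calc = new PositionCalculator();
        calc.addTrade("POWER-ERCOT", 100);   // buy 100 MWh
        calc.addTrade("POWER-ERCOT", -100);  // sell 100 MWh
        assertEquals(0, calc.netPosition("POWER-ERCOT"));
    }

    @Test
    public void positionsAreTrackedPerProduct() {
        PositionCalculator calc = new PositionCalculator();
        calc.addTrade("POWER-ERCOT", 100);
        calc.addTrade("NATGAS-HH", -50);
        assertEquals(100, calc.netPosition("POWER-ERCOT"));
        assertEquals(-50, calc.netPosition("NATGAS-HH"));
    }
}
```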

Choosing an Enterprise Trading Communication Protocol

In building modular software for energy trading organizations, each module is designed to perform a particular set of tasks, usually correlated to the offices: front, middle and back. The main purpose of creating modular, independent systems is the ability to target a particular set of features without having to account for everything upstream and downstream in the IT value chain. Building software for exchange order management requires a different set of behaviors and characteristics than creating invoices for your residential power customers. Building the entire value chain of an energy trading shop into a single application may work for startups with limited scope, but it will quickly decay in the face of scale. To create scalable systems, modularity is required, but with modularity comes the requirement of data interfaces.

Building data interfaces and APIs is an important part of building this type of software. For many different reasons, the architects and developers involved will spin their wheels creating a new, completely proprietary way to describe the nouns of their system, such as trades or positions, instead of using one of the multiple proven protocols already available for free. Usually the arguments against using industry-standard protocols internally revolve around perceived unique requirements or the sense that these fully baked protocols are overkill. However, building a proprietary communications protocol creates ongoing and expensive technical debt within the organization as the scope of the protocol expands and disagreements arise between system owners about the “right” way to model a particular interaction. It’s much easier to stand on the shoulders of industry giants and leverage years of trading experience by using proven and widely adopted protocols, which also provide an externally managed mediation point for disagreements about how and what gets communicated.

Right now, there are three main standards for modeling communication in energy trading. These standards have gotten quite a facelift in the past four years, expanding in response to Dodd-Frank and other regulatory legislation. They standardize not only format but also some content, allowing reuse of enumerations to describe certain trade values, such as settlement types or asset classes.

  • FIX / FIXML – Financial Information eXchange – This protocol is the most mature, as it was born from equities trading and other models that are purely financially settled. In recent years, however, the protocol has expanded into several venues of commodities trading, becoming the protocol of choice for ICE’s Trade Capture and Order Routing platforms as well as almost all of CME’s portfolio of electronic connectivity. The model is more normalized, in the sense of having multiple reference points to related data: instrument definitions can be communicated at different times or through different venues, while trades simply refer to those instruments by code, allowing for faster communication.
  • FpML – Financial products Markup Language – This protocol was recently used by DTCC to implement their swap data repository (SDR), and while it’s much more robust in its implementation, it has quite a learning curve. The communication structure lends itself more toward each unit being a complete and total description of a transaction, duplicating standard data such as product or counterparty information across transactions. The protocol is XML-based and much more verbose, but it allows finer-grained control over things like date logic for instruments. It also has multiple versions, tailored to the specific needs of the organization.
  • CpML – Similar to FpML, this protocol is built directly around commodities descriptions, and although it’s more widely adopted across the pond in Europe around EFET, its value holds for US-based organizations.
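
Whichever protocol wins, consuming it is ordinary XML handling. Below is a minimal sketch, assuming a drastically simplified, FpML-flavored message whose element names are invented for illustration (the real FpML and FIXML schemas are far richer and validated against XSDs); it uses only the JDK’s built-in DOM parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Illustrative only: a drastically simplified, FpML-flavored trade message.
// Note the self-contained style the post describes -- party and product
// data travel inside every message, unlike a FIX-style message that would
// refer to a separately communicated instrument by code.
public class TradeMessageExample {
    public static void main(String[] args) throws Exception {
        String xml =
            "<trade>" +
            "  <buyerParty>ACME_ENERGY</buyerParty>" +
            "  <sellerParty>BIG_BANK</sellerParty>" +
            "  <commodity>" +
            "    <instrumentId>NG</instrumentId>" +
            "    <deliveryPoint>Henry Hub</deliveryPoint>" +
            "    <settlementType>Physical</settlementType>" +
            "  </commodity>" +
            "  <notionalQuantity unit=\"MMBTU\">10000</notionalQuantity>" +
            "</trade>";

        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        String buyer = doc.getElementsByTagName("buyerParty").item(0).getTextContent();
        String qty = doc.getElementsByTagName("notionalQuantity").item(0).getTextContent();
        System.out.println(buyer + " buys " + qty + " MMBtu");
    }
}
```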

But picking a standard protocol is only the first step. There are some additional concerns to keep in mind when implementing these protocols to reduce the number of headaches later.

  • Treat internal data consumers exactly the same way you would treat external data consumers – like customers.
  • Create a standards committee that has power to define and model the system interactions, but make sure the committee can make quick decisions through informal processes.
  • Always force any requests to extend the model for proprietary requirements through a rigorous standards process, to ensure those designing the interfaces aren’t simply doing the easy thing rather than the right thing. I am always amazed at how quickly an organization will throw out the need for technical standardization when faced with a small decision to adjust an existing business process.
  • Broadcast each event generically, allowing listeners to determine what they need and throw away what they don’t (a minimal sketch follows this list). All else being equal, it’s easier to broadcast everything than to open up individual data elements one by one.
  • Create and use common patterns for interfacing. Having one system be request-response and another be messaging-based will create just as many issues as a proprietary XML protocol.
  • As always, make the right thing to do the easy thing to do for developers. Make an investment in tooling and training for the stakeholders involved to ensure success.
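
As a minimal sketch of the “broadcast the event generically” point above: the event type, field names and in-process bus are all hypothetical (and use a Java 16+ record), and in a real shop this would sit on a message broker, but the pattern is the same – publish the whole event once and let each listener pick out what it needs.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-process event bus illustrating generic broadcast.
public class TradeEventBus {

    public interface TradeListener {
        void onTrade(TradeEvent event);
    }

    // The full event is always published; listeners pick what they need.
    public record TradeEvent(String tradeId, String product,
                             String counterparty, double quantity,
                             double price) {}

    private final List<TradeListener> listeners = new CopyOnWriteArrayList<>();

    public void subscribe(TradeListener listener) {
        listeners.add(listener);
    }

    public void publish(TradeEvent event) {
        for (TradeListener l : listeners) {
            l.onTrade(event); // every listener sees the whole event
        }
    }

    public static void main(String[] args) {
        TradeEventBus bus = new TradeEventBus();
        // Risk cares about product and quantity; it ignores the rest.
        bus.subscribe(e -> System.out.println("risk: " + e.product() + " " + e.quantity()));
        // Settlements cares about counterparty and price.
        bus.subscribe(e -> System.out.println("settlements: " + e.counterparty() + " @ " + e.price()));
        bus.publish(new TradeEvent("T-1001", "NG", "BIG_BANK", 10000, 4.25));
    }
}
```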

Technical Debt Snowball

Most Americans are familiar with a personal finance strategy colloquially known as “Dave Ramsey”. The program is actually called The Total Money Makeover, and it is a regimented and proven strategy for getting your debt under control and moving into a more financially sound position. Its foundation is a set of tactics collectively referred to as the “debt snowball”. Using this tactic, you line up all your outstanding debts from smallest to largest and begin to pay extra money toward the smallest debt first, while paying only the absolute minimum on all the others. Once the smallest debt is paid, you take the amount you were paying on the first debt, add it to the minimum payment on the second debt, and start paying down the second debt. Like a snowball rolling down a hill, with each debt conquered the total payment grows bigger, giving ever-increasing leverage to knock out the larger debts. There is also a huge psychological boost for the average person in starting with the small debts first: it creates momentum and quick wins, which are imperative for those who have trouble forming habits without positive feedback. In a purely financial sense, it’s actually better to pay off the highest-interest debt first, but the difference in percentages may not justify the loss of momentum and positive feedback the average person struggling with debt needs to be successful.
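
The mechanics are simple enough to sketch in code. The following is a toy model, assuming made-up balances and payments and ignoring interest entirely; it just shows how each retired debt’s payment rolls into the next.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// A toy model of the debt snowball. Balances are invented and interest is
// ignored; the point is the rollover mechanic, not the finance.
public class DebtSnowball {

    static class Debt {
        final String name;
        double balance;
        final double minimumPayment;

        Debt(String name, double balance, double minimumPayment) {
            this.name = name;
            this.balance = balance;
            this.minimumPayment = minimumPayment;
        }
    }

    public static void main(String[] args) {
        List<Debt> debts = new ArrayList<>(List.of(
                new Debt("Store card", 500, 25),
                new Debt("Car loan", 8000, 200),
                new Debt("Student loan", 20000, 150)));

        // Smallest balance first -- momentum over interest-rate math.
        debts.sort(Comparator.comparingDouble((Debt d) -> d.balance));

        double extra = 100; // money available beyond the minimums
        int month = 0;
        while (!debts.isEmpty()) {
            month++;
            // Every debt gets its minimum; the smallest also gets the extra.
            for (Debt d : debts) d.balance -= d.minimumPayment;
            debts.get(0).balance -= extra;

            if (debts.get(0).balance <= 0) { // overpayment ignored for simplicity
                Debt paid = debts.remove(0);
                extra += paid.minimumPayment; // the snowball grows
                System.out.printf("Month %d: %s paid off; snowball is now $%.0f extra%n",
                        month, paid.name, extra);
            }
        }
    }
}
```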

Surprisingly, the process starts with a step that doesn’t sound intuitive but is absolutely imperative for people living with crippling debt: the emergency fund. Before any debt is paid, the individual is strongly encouraged to build an emergency fund of $1,000 to create a cushion for any unexpected expenses that may arise over the course of the journey. The reasoning is this: people living under the immense pressure of not knowing whether they can pay the electric bill, or whether a car repair might cost them their job, are simply unable to make good decisions. The pressure is so intense that most people’s ability to think strategically gets short-circuited; they think only about how to relieve the pressure, even temporarily and by any means necessary. In addition, people living in such conditions have a strong tendency to escape the pressure in other ways, and the escapes tend to revolve around more bad money decisions, such as impulse purchases or throwing money at guilty pleasures like food and drink. The first step of Dave Ramsey’s plan is to relieve this pressure and create space for strategic thinking.

In any business with large IT operations, there is a very real concept of technical debt: the organization makes technical decisions in the present that will require some level of effort in the future to “correct”. At the enterprise level, even major product decisions create technical debt the organization may not be aware of. For example, choosing Microsoft Outlook for your company’s email creates technical debt around upgrade paths – the organization will eventually be forced to upgrade or change platforms. In custom application development, technical debt correlates directly with decisions such as choosing not to automate regression testing or not writing unit tests. These decisions may be due to budget constraints or simply not having enough staff to perform the activities, but you’ll end up paying for them later in time or lost productivity.

When an IT organization creates too much technical debt, whether through its own lack of discipline or through direct pressure from decision makers, it eventually becomes operationally crippled. The staff gets so bogged down with production bugs and low-value activities, such as manually testing code, that no new features or business can happen. The IT staff start living with the illusion of choice – do we spend our time today getting our CRM back up so we can conduct normal business, or do we work on the new website designed to bring in new customers? There is no choice there. When things get this bad, there will usually be some sort of change that allows movement on paying down the debt. This could come in the form of cancelled projects, contractors brought in as hired muscle, or even outsourcing an entire system replatforming if the debt is too large (i.e., IT bankruptcy).

xkcd #1205, “Is It Worth the Time?”: http://xkcd.com/1205/

When the company becomes aware of the issues, the prioritization discussions begin. How do we pay down our debt? Some may suggest a purely financial model – eschewing momentum and prioritizing the issues with the largest impact. However, this type of prioritization means the organization misses a key feature of the snowball: it needs to learn how to pay off technical debt without creating new debt. Taking on a huge new project in an environment of already crippling technical debt pushes off the positive feedback needed to reinforce that what is being done is good for the company. After six months, the key stakeholders may forget why it’s important and refocus their efforts on daily firefighting. By prioritizing smaller and easier automation projects, momentum and feedback loops can be created, giving a better chance of long-term success in making good software delivery practices part of the normal culture.

The organization should start the attack on technical debt using continuous delivery concepts. First, it needs to create that emergency fund. For the most part, the quickest way to get your organization out of panic mode (i.e., emergency production support) is the creation of automated regression testing. Being able to fix bugs without introducing new bugs is the debt snowball equivalent of knowing your lights won’t be turned off. It creates a cushion of confidence that developers can fix current production issues without introducing new debt. This is also the start of the technical debt snowball’s momentum, as the IT staff can begin fixing actual bugs with the time they were previously spending identifying and retesting old support issues. Every new automated test that saves 5 minutes a day frees up roughly 20 hours over a working year; think about 100 automated regression tests, each saving 5 minutes a day, and you’ll see the snowball (the rough math is sketched below). Automated regression testing should always be the starting point for any debt paydown – even if you are replatforming! Once the automated regression snowball starts knocking out those small debts, the time saved can be used to tackle larger pieces of automation, such as automated build and deployment scripts. That creates the cushion your ops team needs to start monitoring services more proactively, leading to additional uptime as the ops folks tackle issues before they become outages.
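
Here’s the back-of-the-envelope math behind that claim, assuming a 250-day working year (the inputs are illustrative, not measured):

```java
// Back-of-the-envelope math for the testing snowball, assuming a
// 250-day working year. All numbers are illustrative, not measured.
public class TestingSnowballMath {
    public static void main(String[] args) {
        int minutesSavedPerTestPerDay = 5;
        int workingDaysPerYear = 250;
        int tests = 100;

        double hoursPerYearPerTest =
            minutesSavedPerTestPerDay * workingDaysPerYear / 60.0; // ~20.8 hours
        double hoursPerDayForAllTests =
            tests * minutesSavedPerTestPerDay / 60.0;              // ~8.3 hours

        System.out.printf("One test frees ~%.0f hours a year%n", hoursPerYearPerTest);
        System.out.printf("%d tests free ~%.1f hours every day%n", tests, hoursPerDayForAllTests);
    }
}
```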

The cushion created by automation then gives the organization room to tackle larger strategic issues. To be clear, executives don’t want to talk about a five-year vision when core systems have been down three times this week. But once the snowball starts rolling, it’s hard to stop. The momentum created by continuous delivery builds a fervor in the business for more process optimization, which in turn lets the IT staff work on projects with actual business value. Everyone in the organization begins to recognize the need to pay down debt regularly and to give developers and operations more budgeted time for the activities that reduce technical debt, such as writing automated tests and refactoring bad code. If you are currently working in an environment where fire drills and emergency releases are the daily norm, it can be excruciating, and ultimately the employer will start losing key staff. One way or another, technical debt will be addressed. The only question is how painful it’s going to be.
