Stephen Nimmo

Energy Trading, Risk Management and Software Development


Technical Debt Snowball

Most Americans are familiar with a personal finance strategy colloquially known simply as “Dave Ramsey”. The program is actually called My Total Money Makeover, and it is a regimented, proven strategy for getting your debt under control and moving into a more financially sound position. Its foundation is a single set of tactics collectively referred to as the “Debt Snowball”. Using this tactic, you line up all your outstanding debts from smallest to largest and pay extra money toward the smallest debt first, while paying only the absolute minimum on all the others. Once the smallest debt is paid, you take the amount you were paying on it, add it to the minimum payment on the second debt and attack that one. Like a snowball rolling down a hill, the total payment grows with each debt conquered, giving ever-increasing leverage to knock out the larger debts. Starting with the small debts also provides a huge psychological boost, creating momentum and quick wins, which are imperative for people who have trouble forming habits without positive feedback. In a purely financial sense it’s actually better to pay off the highest-interest debt first, but the difference in percentages rarely justifies the loss of momentum and positive feedback that the average person struggling with debt needs to be successful.
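
For the developers reading along, the tactic itself is easy to express in code. The sketch below is purely illustrative: the debts, balances and extra payment amount are made up, and it only shows the ordering and rolling of payments, not a full amortization.

[java]
import java.util.Arrays;
import java.util.Comparator;

// Toy illustration of the debt snowball: sort debts smallest to largest and,
// as each is retired, roll its full payment into the next one. All figures
// below are made up for illustration.
public class DebtSnowball {

    static class Debt {
        final String name;
        final double balance;
        final double minimumPayment;

        Debt(String name, double balance, double minimumPayment) {
            this.name = name;
            this.balance = balance;
            this.minimumPayment = minimumPayment;
        }
    }

    public static void main(String[] args) {
        Debt[] debts = {
            new Debt("Car loan", 12000, 250),
            new Debt("Credit card", 2500, 50),
            new Debt("Store card", 600, 25)
        };
        Arrays.sort(debts, Comparator.comparingDouble(d -> d.balance));

        double snowball = 100; // extra cash paid on top of the minimums
        for (Debt debt : debts) {
            double payment = debt.minimumPayment + snowball;
            System.out.printf("Attack %s ($%.2f) with $%.2f per month%n",
                    debt.name, debt.balance, payment);
            snowball = payment; // once paid off, the whole payment rolls forward
        }
    }
}
[/java]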

Surprisingly, the process starts with a first step that doesn’t sound intuitive but is absolutely imperative for people living with crippling debt: the emergency fund. Before any debt is paid down, the individual is strongly encouraged to set aside a $1,000 emergency fund as a cushion for any unexpected expenses or issues that may arise over the course of the journey. The reasoning is this: people living under the immense pressure of not knowing whether they can pay the electric bill, or whether a car repair might cost them their job, are simply unable to make good decisions. The pressure is so intense that most people’s ability to think strategically gets short-circuited, and they only think about how to relieve the pressure, even temporarily and by any means necessary. People living in such conditions also have a strong tendency to escape the pressure in other ways, and those escapes tend to revolve around more bad money decisions, such as impulse purchases or throwing money at guilty pleasures like food or drink. The first step of the Dave Ramsey plan is to relieve this pressure and create space for strategic thinking.

In any business with large IT operations, there is a very real concept of technical debt: the organization makes technical decisions in the present that will require some level of effort in the future to “correct”. At the enterprise level, even major product decisions create technical debt the organization may not be aware of. For example, choosing Microsoft Outlook for your company’s email creates technical debt around upgrade paths; the organization will eventually be forced to upgrade or change platforms. For custom application development, technical debt correlates directly with decisions such as not automating your regression testing or not writing unit tests. These decisions may be driven by budget constraints or simply not having enough staff to perform the activities, but you will end up paying for them later in time or lost productivity.

When your IT organization creates too much technical debt, whether through its own lack of discipline or through direct pressure from decision makers, it eventually becomes operationally crippled. The staff gets so bogged down with production bugs and low-value activities, such as manually testing code, that no new features or business capabilities get delivered. The IT staff starts living with the illusion of choice: do we spend our time today getting the CRM back up so we can conduct normal business, or do we work on the new website designed to bring in new customers? There is no real choice there. When things get this bad, there is usually some sort of change that allows movement on paying down the debt. It could come in the form of cancelled projects, contractors brought in as hired muscle or, if the debt is too large, outsourcing an entire system replatforming (i.e. IT bankruptcy).

http://xkcd.com/1205/

When the company becomes aware of the issues, the prioritization discussions begin. How do we pay down our debt? Some may suggest a purely financial model, eschewing momentum and prioritizing the issues with the largest impact. Prioritizing that way, however, misses a key feature of the snowball: the organization needs to learn how to pay off technical debt without creating new debt. Taking on a huge new project in an environment of already crippling technical debt pushes off the positive feedback needed to reinforce that what is being done is good for the company. After six months, the key stakeholders may forget why it’s important and refocus their efforts on daily firefighting. By prioritizing some smaller and easier automation projects, momentum and feedback loops can be created, giving a better chance of long-term success at incorporating good software delivery practices into the normal culture.

The organization should start the attack on technical debt using continuous delivery concepts. First, it needs to create that emergency fund. For most shops, the quickest way to get out of panic mode (i.e. emergency production support) is to build automated regression testing. Being able to fix bugs without introducing new ones is the debt snowball equivalent of knowing your lights won’t be turned off. It creates a cushion of confidence that developers can fix current production issues without taking on new debt. It is also the start of the technical debt snowball’s momentum, as the IT staff can begin fixing actual bugs with the time they previously spent identifying and retesting old support issues. Every new automated test that saves 5 minutes a day buys back roughly 20 hours, or about two and a half working days, over a year. Now imagine 100 automated regression tests, each saving 5 minutes a day, and you’ll see the snowball. Automated regression testing should always be the starting point for any debt paydown, even if you are replatforming! Once the automated regression snowball starts knocking out those small debts, the time saved can be used to tackle larger pieces of automation, such as automated build and deployment scripts. That creates the cushion your ops team needs to start monitoring services proactively, leading to additional uptime as the ops folks tackle issues before they become outages.
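
To make “automated regression testing” concrete, a single test can be as small as the sketch below. It uses JUnit, and the InvoiceCalculator class, its method and the expected total are hypothetical stand-ins for whatever logic your team currently re-verifies by hand before each release.

[java]
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Hypothetical regression test: it pins down behavior that was previously
// verified by hand before every release. Once it runs in the build, the
// minutes spent manually re-checking this rule become slack.
public class InvoiceCalculatorRegressionTest {

    @Test
    public void lateFeeIsAppliedAfterGracePeriod() {
        InvoiceCalculator calculator = new InvoiceCalculator();
        // $100.00 invoice, 10 days past due, 5-day grace period, 1% daily late fee
        double total = calculator.totalWithLateFee(100.00, 10, 5, 0.01);
        assertEquals(105.00, total, 0.001);
    }
}
[/java]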

The cushion created by automation then creates the opportunity for the organization to tackle larger strategic issues. To be clear, executives don’t want to talk about a five-year vision when core systems have been down three times this week. But once the snowball starts rolling, it’s hard to stop. The momentum created by continuous delivery builds an appetite in the business for more process optimization, which in turn lets the IT staff work on projects with actual business value. Everyone in the organization begins to recognize the need to pay down debt regularly and to give developers and operations more budgeted time for the activities that reduce technical debt, such as writing automated tests and refactoring bad code. If you are currently working in an environment where the norm is daily fire drills and emergency releases, it can be excruciating, and eventually the employer will start losing key staff. One way or another, technical debt will get addressed. The only question is how painful it’s going to be.

Next Phase of SDR Reporting – Reconciliation

It’s now almost six months since the final deadline for compliance with the CFTC Title VII Dodd-Frank regulations requiring Swap Data Repository (SDR) reporting. With the proverbial dust settled and the holiday season behind us, commodities trading compliance groups across the country are engaging in the next phase of Dodd-Frank implementations, including the upcoming position limits rules. However, some market participants are finding new requirements for already implemented rules springing to the forefront of their backlogs.

To understand the new requirements, it is necessary to understand the landscape in which CFTC Title VII Part 43 (Real-Time Public Reporting) and Part 45 (Swap Data Recordkeeping and Reporting Requirements) were implemented with respect to commodities. There was quite a bit of volatility in the requirements themselves, as many different legal interpretations regarding the exact meaning of phrases such as “swap” were left in limbo. In what is arguably the most complicating implementation decision, the data standards for Part 45 included a rule permitting an SDR to allow reporting parties to use any data standard acceptable to that SDR. The effects were felt almost immediately: market participants could not begin work on delivering the data until the SDRs committed to a format. While some basic guidelines were presented in the rule appendices, the implementation turned out to be a much more complicated endeavor.

The SDR vendor situation also played a huge role. The first SDR approved for commodities was ICE’s Trade Vault, but it was only provisionally approved in late June 2012. At that point, the Part 45 rules included a mid-January 2013 deadline for reporting of “other” derivatives classes, which included commodities. Seven months before the deadline, participants in commodities trading had exactly one SDR available, and it was in beta. The only other SDR vendor on the horizon for commodities was the DTCC Data Repository (DDR), but its request to operate for commodities derivatives was still pending. Interestingly, DDR was already operating in the credit, equity, interest rate and foreign exchange derivatives markets, and thanks to its Wall Street alignment with the existing swap dealers in those lines of business, DTCC held an overwhelming slice of the SDR reporting market share for those asset classes.

However, commodities market participants responded enthusiastically to ICE’s Trade Vault based on several compelling system features. First, the product’s interface and workflows piggybacked directly on ICE’s very popular electronic confirmation system, eConfirm. By leveraging the same interface, many commodities market participants saw a huge advantage in limiting their DFA Part 43 and 45 implementations to supplementing already existing confirmation processes with the new DFA-prescribed data fields, such as execution timestamp. In addition to leveraging the interfaces, ICE created some genuinely valuable services for its customers, both documenting which instrument types were reportable under the law and allowing a single message submission to comply with both real-time and PET reporting requirements. DTCC’s approval for DDR to handle commodities in the U.S. didn’t come until late 2012, which was too late for most market participants, except those already using DDR in other asset classes.

With the CFTC pushing deadlines out until mid-2013, the market had some time to catch up and finish its implementations, but the delay also created some interesting data fragmentation issues. The Non-SD/Non-MSP market participants, sometimes referred to as “end-users”, largely ended up using ICE’s Trade Vault as their SDR. However, a large swath of swap dealers ended up with DTCC’s Data Repository, and it’s not hard to see why. These commodities swap dealers were mostly big banks or very large financial institutions engaged in multi-asset-class derivatives dealing in addition to many other services, most of which already communicated with DTCC for other services, such as securities clearing and settlement. The larger financial institutions were also already familiar with the communication protocol used by DDR, the Financial Products Markup Language (FpML), which was used extensively in other asset classes. This split creates an interesting set of issues for the market because of another subtle nuance in the DFA reporting rules: the participant responsible for reporting gets to pick the destination SDR. For a market participant using ICE’s Trade Vault as its only SDR, executing a swap on an electronic platform (SEF or DCM) or with a swap dealer using DDR creates a situation where the reported data is never “seen” by the end-user’s systems. Add a third SDR to the market in the form of CME’s Repository Service, and the reporting situation becomes even more complicated.

As market participants enter the new year, they will be faced with prioritizing a new set of reconciliation burdens from SDR reporting. While position limits and other DFA rules may take the front seat, the commodities market as a whole will be left with very few participants able to verify and reconcile the data being reported on their behalf to the CFTC. Until the SDR data harmonization rules are enforced, regulatory reporting requirements will remain volatile, complex and costly because of this data fragmentation.

CFTC Will Hold an Open Meeting to Consider Proposals on Position Limits


The CFTC will be holding an open meeting to discuss and consider proposals on the position limits rules associated with the Dodd-Frank Act. To listen in on the proceedings, you can either call in or subscribe to the webcast. Details are here.

The biggest part of the legislation pertains to the spot-month limits. Per the FAQ:

“Spot-month position limits will be set at 25% of deliverable supply for a given commodity, with a conditional spot-month limit of five times that amount for entities with positions exclusively in cash-settled contracts.
Non-spot-month position limits (aggregate single-month and all-months-combined limits that would apply across classes, as well as single-month and all-months-combined position limits separately for futures and swaps) will be set for each referenced contract at 10 percent of open interest in that contract up to the first 25,000 contracts, and 2.5 percent thereafter.”

In addition to the limits themselves, there are also extensive rules around position aggregation, aiming to close the loopholes created by spreading positions across different legal entities with common ownership.

At the end of the day, the CFTC is trying to remove rampant speculation causing additional volatility in the near month markets. However, the commission may be going too far, reducing the benefits of speculative positions as a way for physical market participants to offload price risk. Let’s hope their findings create a healthy market response and increase liquidity.

Value Based Feature Prioritization

When someone asks me what I do, my answer is fairly simple. I try to influence my company to do the highest value things first.

The classic product manager is tasked with deciding which features and enhancements will make the product better. A ton of work and time goes into simply identifying features and writing descriptions of the processes that provide value to your organization. The examples can get fairly complicated, mostly in describing the assumed state of the systems and data prior to the start of the process, as well as documenting the alternate cases for different workflows. These descriptions can be produced in different ways depending on your management style, but most product managers write user stories or use case documents, and plenty of product managers out there are still writing requirements documents.


Writing use cases and user stories takes up about 70% of my daily time, but the act of simply creating and formatting this process documentation is not where product managers provide value to the organization. A good product manager delivers the highest value features first, and determining the value of features to the organization is key to delivering software and projects productively. Without some metrics around feature valuation, it’s guessing. There are numerous ways to determine feature value. Here are some of my favorites.

Operational Expense Reduction

To use a simple example, take a piece of software that requires your team of DBAs to provide about 4 hours of ad-hoc support every week. Multiply that by 52 and you have a yearly cost (~200 hours a year at $100/hour), a rough value of around $20k. A simple case like this doesn’t do justice to the fact that small software changes and process updates could reduce thousands of people’s workload by something as simple as 30 minutes a day. Take the same piece of software and add a feature that reduces the average user’s work by 30 minutes a day. Let’s say you have 1,000 users. That’s a reduction of 500 hours a day! At $20/hour that’s $10,000 a day, so over a 250-day working year the feature is potentially worth $2.5 million.
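
Here is the same back-of-the-envelope arithmetic as a quick sketch; the headcount, rates and 250-day working year are illustrative assumptions, not real figures.

[java]
// Back-of-the-envelope feature valuation: hours saved per day, priced at a
// loaded hourly rate and annualized over a working year. All inputs below
// are illustrative assumptions.
public class FeatureValueEstimate {

    public static double annualValue(int users, double hoursSavedPerUserPerDay,
                                     double hourlyRate, int workingDaysPerYear) {
        double hoursSavedPerDay = users * hoursSavedPerUserPerDay;
        return hoursSavedPerDay * hourlyRate * workingDaysPerYear;
    }

    public static void main(String[] args) {
        // 1,000 users each saving 30 minutes a day at $20/hour, 250 working days
        System.out.println(annualValue(1000, 0.5, 20.0, 250)); // 2500000.0
    }
}
[/java]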

Increased Revenue

“If your software had that feature, we’d buy it.” This is a statement heard in almost any sales cycle. Sometimes it’s a ploy to start negotiations with leverage on the buyer’s side, especially if there are other products on the market with the desired feature. Other times the statement is used to judge the software provider’s ability to respond to the buyer’s needs; quite frankly, there is a reason for the stereotype of the highly responsive vendor who disappears after the check clears, and watching a vendor’s behavior prior to the sale is smart for any buyer. And sometimes it’s an actual, legitimate request for the feature.

How do you value a proposed feature for increasing revenue? Your company presumably has some metrics around the profitability of similar clients or past sales, so depending on the size of the sale (one license versus 1,000 licenses), the feature’s value could be enough to push the priority up for that one sale to close. Most of the time, though, it’s not only about the feature’s value to the single potential client, but to the thousands of other clients in your target segment. If a feature could add 200 new clients, do the math on profitability per new client to determine the value.

However, be wary of two additional realities of this valuation method. First, you must also look at how the feature may or may not be accepted by your existing client base. Is this feature, which might be a significant change to an existing workflow, going to cause waves with your existing users? It’s always cheaper to keep an existing client than to bring on a new one, so if there is any potential for lost clients, the cost of that risk must be factored into the feature’s value. Second, have you ever heard a salesperson say a new feature or enhancement wouldn’t boost sales? Neither have I. Base the decision on actual metrics or past performance, not the gut feel of your sales force.

Reduced Organizational Risk

Every day your company is at risk. Depending on the type of company, your contractual obligations and regulatory requirements could put your daily exposure into the millions of dollars. Some of the energy regulatory bodies, such as FERC and the CFTC, can put serious weight on your company if you do not follow their rules, especially if your compliance gap is around worker safety.

Product features in your backlog can be tied directly to these potential risks. When I worked for Enron in their natural gas transportation division, we were engaged in a full rewrite of the contracts and capacity release systems for three big pipelines. The project was years in the making, and while it was happening, maintenance on the existing system had to be cut down to the bare bones. Along the way there were regulatory changes that required system changes, whether operational changes or something as simple as changing the way volumes are aggregated, and those mandates carried penalties as well as due dates. During the rewrite these changes were introduced as features, and while some of the due dates were given extensions, eventually the time came for us to get the project into production or face fines. Needless to say, those were some long months, but the delivery was a success.

When you are trying to value features associated with risk, there are two ways to do it. First, there is the simple case of a feature that is absolutely required by a certain date, and if it isn’t there, you will be fined. The potential fine is only the baseline of the actual value. In certain cases the increased scrutiny into the organization will cost you even more: the time and expense associated with data requests and audits can be very painful, not to mention the costs of the negative publicity. No company wants to be associated with non-compliance, because clients will view those gaps as potential hazards to their own operations.

The really difficult area of valuation is features that replace existing, brittle processes. To use an example from energy trading, suppose your end-of-day calculation that determines the cash needed for margin calls sometimes “breaks”; the number may occasionally be wrong without causing the organization any harm. If it says you need 100k but you only need 80k, the only loss is the opportunity cost of the extra 20k in cash that could have been used elsewhere as working capital. But flip those numbers around and the story changes. Falling 20k short on a margin call could spell disaster, inviting not only regulatory scrutiny but potentially a full liquidation of the position at a huge loss. Valuing these features is essentially a real options analysis: take the probability of each of these compliance events happening and multiply it by the potential loss.
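
A rough sketch of that expected-loss calculation is below; the probabilities and loss figures are made up purely for illustration.

[java]
// Probability-weighted exposure for a brittle process: sum each failure
// scenario's probability times its potential loss. The scenarios below are
// made up purely for illustration.
public class RiskFeatureValue {

    public static double expectedAnnualLoss(double[] probabilities, double[] losses) {
        double expected = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            expected += probabilities[i] * losses[i];
        }
        return expected;
    }

    public static void main(String[] args) {
        // Scenario 1: margin call overstated, opportunity cost on idle cash
        // Scenario 2: margin call understated, forced liquidation at a loss
        double[] probabilities = {0.10, 0.02};
        double[] losses = {5000, 2000000};
        System.out.println(expectedAnnualLoss(probabilities, losses)); // 40500.0
    }
}
[/java]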

Technical Debt

Every project accumulates technical debt along the way as architectural decisions are made. Some of the decisions creating this debt are driven by external pressure to deliver at all costs, causing developers to hack together solutions quickly with the promise of going back and refactoring out the bad code (which rarely happens). Technical debt also accumulates as the product requirements change. The ORM decision made during the first month of a project could be a proverbial anvil around the neck of the development team three years later. These decisions are made on the best available data, and sometimes that data is incomplete.


Technical debt has very distinct impacts on the product, specifically inhibiting the ability to deliver value quickly. Adding a feature in one stack could take a day; in another it could create a week-long regression requirement and cost ten times as much. If pure technical debt issues exist, they need to be put on the backlog and prioritized. These backlog items can, and quite often do, become budget killers, adding 25% to costs and sometimes much more. Much as a company does a stock buyback when it has capital on hand and no projects on the books with a good ROI, when there is nothing in the backlog more valuable than refactoring out this debt, pull the trigger and get it done. Refactoring out bad code is like taking the extra minute to put on a pair of shoes before running a marathon.

In conclusion, a good product manager isn’t only someone who can write good use cases. It’s someone who can make good decisions and provide great analysis of where the product should focus. These feature decisions affect the bottom line, and something as subtle as prioritizing one feature over another could make or break your organization. As always, a product manager is only as good as the support they have from management. Empower your product managers to make decisions, but don’t be afraid to hold them accountable for why things were prioritized the way they were.

Define Your World

I recently finished reading The Dip, a short book by Seth Godin which does a great job explaining how to be successful in the projects in which you choose to engage. The concept is simple.


  1. Anything you choose to do (a project, work, a hobby, etc.) has a dip: the amount of time and effort required to get from the excitement and enthusiasm of a new endeavor to the desired end result – also known as the hump.
  2. Everything has a dip but dips are different. Learning to say hello in 5 languages has a dip of maybe an hour or two. Learning to speak 5 languages fluently has a dip of years, possibly decades.
  3. For those things you are seriously interested in engaging in (not hobbies), such as starting a new company or writing a book, you need to prepare for the dip, recognize it when it happens and work through it, either by adopting strategies to stay productive through it or by shortening it.
  4. If you aren’t willing to do what’s required to get through the dip, you shouldn’t even bother. Quit now and save yourself the time and effort, and focus on something you are willing to push through the dip for.
  5. You should also strive to be the best in the world in what you do.

Best in the world? Yep. But Seth throws a little curveball: his definition of “world” comes with a big asterisk. Best in the “world” means best in the world as you define it. Looking at it as a consumer, let’s say you want a graphic artist to do some logo work for you. You want the best (who doesn’t, amirite?). But there are several problems with getting the best. The best is usually already busy with work they find more interesting (availability), their prices are out of your range (accessibility), and some of them you might not even know about. So you can’t have the best, but you can have the best of the graphic artists who have bandwidth and are willing to take what you are willing to pay. Those two characteristics alone just cut your pool down by about 95%.

But using Seth’s logic, this is exactly what you should focus on: being the best in the world of graphic artists who freelance, are willing to work for less than $100/hr and have bandwidth. All of a sudden, the world as you’ve defined it becomes much clearer and even attainable. Now, you can take this world construction down to such a granular level that it becomes absurd: I am the best blogger in the world named Stephen, using WordPress, who is 6’1″, etc. That doesn’t do much for me. But aiming to be the best blogger in the entire world doesn’t give me much value either. Instead, I pick a middle ground. I want to be the best software development and product management blogger in the Houston area. While that pool is still pretty deep, it is something I can strive toward and an attainable target, if I can get past the dip. My dip involves refining my writing style and increasing the value of my offerings. These are tough things to work on and require some serious commitment. Your world may also change as you develop and gain new skillsets or experience. A software developer’s world may go through several phases:

The best at documentation out of the 3 junior developers on the team.

The best at refactoring the persistence layer on the team.

The best developer on the team.

The best developer in my section.

The best developer in my company.

etc…

If you have some spare hours this weekend, grab a copy of the book and give it a read, especially if you are like me and have many more projects that you start but don’t finish. Reading the book and understanding the concept isn’t hard. Defining your world is the real dip. Start small. Dream big.

Production-ready standalone REST using Dropwizard

There has been a large movement toward REST interfaces because of the shift to new mediums such as mobile, as well as the overall shift to more client-side, in-browser functionality using HTML5/JS. While server-side processing of views still has its place, more and more users want fewer full request-response interfaces and more Ajax and push functionality in their web applications. This is almost creating a new era of two-tiered architecture, shifting more of the domain and business processes closer to the user and leaving data access and other pluggable functionality on the server. If you squint, you can see the parallels to the PowerBuilder/stored procedure paradigms, but the difference now is that these technologies are incredibly scalable.

Thanks to the large movement to open source technology used in the big internet shops (Facebook, Twitter, blah-blah-blah), the lowly developer doesn’t have to look far to grab a good starter kit for building some pretty cool applications. One of the latest ones I found is Dropwizard. Per the website:

“Developed by Yammer to power their JVM-based backend services, Dropwizard pulls together stable, mature libraries from the Java ecosystem into a simple, light-weight package that lets you focus on getting things done. Dropwizard has out-of-the-box support for sophisticated configuration, application metrics, logging, operational tools, and much more, allowing you and your team to ship a production-quality HTTP+JSON web service in the shortest time possible.”

The cool thing about Dropwizard is that the services are turned inside out. Instead of the software being embedded in a container, the services usually associated with containers are embedded in the application. It’s a J2SE-based HTTP/JSON/REST framework that doesn’t require the deployment and installation of containers or web servers. The reason that’s cool is the scalability. If you want more services, you can scale horizontally on the same machine by simply starting additional instances.
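
To give a flavor, here is a minimal JAX-RS resource of the kind Dropwizard serves through its embedded container. The HelloResource class and its JSON payload are hypothetical, and the bootstrap class you extend to register resources differs between Dropwizard versions, so treat this as a sketch rather than a copy-paste recipe.

[java]
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;

// Hypothetical resource: Dropwizard registers classes like this with its
// embedded Jersey container and serves them from the built-in Jetty server,
// so there is no WAR to build and no external app server to install.
@Path("/hello")
@Produces(MediaType.APPLICATION_JSON)
public class HelloResource {

    @GET
    public String sayHello(@QueryParam("name") String name) {
        String who = (name == null || name.isEmpty()) ? "world" : name;
        return "{\"greeting\": \"Hello, " + who + "\"}";
    }
}
[/java]

In a Dropwizard service, you register an instance of a resource like this with the environment in your main service class and launch everything from a plain main() method; the getting-started guide covers the exact bootstrap class for your version.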

The other cool thing is that their documentation to get started is damn good. While there are tons of blogs out there that really serve a great need in further refining the existing documentation, Dropwizard needs no help. So I am not going to try to reinvent the wheel here. Go get started: http://dropwizard.codahale.com/getting-started/

After doing some development with the platform, here are some observations:

1) It’s really good for REST-based services. But if you need to serve up any UI outside of a pure HTML5/JS model, you are going to have to jump through some hoops to integrate view technologies such as JSF into the stack. Sure, it has some cool capabilities for adding assets to the web container to be served up, but once you start going there, there are some hurdles and fine-grained knowledge needed to get things to fit right.

2) The security model requires some internal development, especially if you want to do something other than Basic HTTP authentication, and it can get hairy. I tried multiple authentication/authorization models and even tried extending some of the existing framework classes. While it can be done, it’s not easy.

3) No built-in Spring integration and no Java EE CDI stuff. That said, it’s not hard to initialize and get it into the model.

4) The stack is a best-of compilation of some of my favorite things: Jackson, Hibernate Validator and others. I also got introduced to some new stuff like JDBI. If nothing else, Dropwizard gets you to take a look at some other ways of doing things and might even change your development stack.

Overall, it’s worth a weekend for the tinkerer and a POC after that at your shop, especially if you are deploying a mobile app with a lot of users.

Go follow the developer @Coda

Couple of Dropwizard links I found helpful:

http://brianoneill.blogspot.com/2012/05/dropwizard-and-spring-sort-of.html

https://speakerdeck.com/jacek99/dropwizard-and-spring-the-perfect-java-rest-server-stack

http://www.gettingcirrius.com/2013/02/integrating-spring-security-with.html

Putting software inside other software

I’ve lived most of my technical life in web-based applications. Because I am a Java guy and don’t like Swing development, when I build something, it usually has a web-based interface. There are two really great things about web UIs: anyone can use them, and they are horizontally scalable.

When you live in web applications, you also mostly live in application servers. There are so many names and flavors that you can get lost really fast. I came of age during the popularization of J2EE and did quite a bit of work on behemoth platforms like WebLogic and WebSphere, and I don’t think anyone can still justify why we used them. Cutting your teeth on EJB 2.0 CMP entity beans is not the most pleasant experience. Couple that with my earlier experience with CORBA, and you’ll understand why I love Spring.

Recently I have been diving back into the technology, taking a particularly long look at Java EE. I’ve been using JSF for years, but always Spring-backed. I’ve taken a liking to the JBoss AS 7 capabilities and even the TomEE server. But the question still begs to be answered: why do we need these bloated application servers anyway?

With REST and simple embeddable web servers such as Jetty, you can get some of the goodness of a J2SE environment and none of the snowflake configurations required by the “standard” app servers. There are even some open source tools at your disposal, most notably Dropwizard, which gives you some pretty robust standalone web resources without the containers (well, the containers are still there, but they aren’t as intrusive). These platforms do have their drawbacks, requiring some pretty extensive work to handle security, and if you need a full-blown UI using JSF, it can get painful.

What the industry needs is an easily deployable standalone application model that doesn’t require extensive custom configuration. I need to build an application, deploy it over there, create some network rules and start it. And if I need it to cluster, it should be self-aware of its clustering environment (Hazelcast has spoiled me). When your application developers spend more time building plumbing and security than domain objects and service layers, it’s a problem. Don’t get me wrong, there are plenty of useful and relevant cases where a large application server infrastructure is not only necessary but can be a strategic advantage in terms of scalability and ease of deployment. I just wish there were more choices.

Are you ready for the cloud?

The “Cloud” is one of the most overused and misunderstood buzzwords in technology today. For most people, who have watched too many Microsoft commercials touting cloud-integrated operating systems, it’s an abstract resource in the sky that everything magically connects to so that things happen. Most people do not understand that it’s simply an external network of computation resources (computers). Just like the server at work with all your spreadsheets on it (the ubiquitous S: drive or something), the public cloud is simply a set of computers located at a data center that you have access to.

But this is where the true cloud conversation starts, and there are many other aspects of the cloud that people do not understand. To understand cloud computing, especially public cloud such as Amazon EC2, you must first understand virtualization. To understand virtualization, you need to understand operating systems and how they work with your computer. Essentially, a computer is made up of a bunch of different components (hard drive, memory, network card) that talk to each other through a common physical pathway (the motherboard). But there is another layer required for these communications, and that is the operating system (OS). Each component, such as the hard drive, only knows about itself and exposes an interface that allows software to interact with it via the physical connection, and that software is typically the OS. When most people buy a computer, it has a single OS on it that single-handedly controls all the hardware interactions.

Virtualization simply adds a layer that gives multiple operating systems access to the same hardware. Instead of having one OS running on a computer, you can have two or more running as virtual machines (VMs). The virtualization software acts as a traffic cop, taking instructions and other interactions from each OS and ensuring they all get access to the hardware devices they need, while also making sure each device gets the needed information back to the correct OS. There are lots of examples of virtualization software, most notably VMware and VirtualBox, that allow users to run multiple OSes on their machines; you can run an instance of Ubuntu on Windows 7. There are huge benefits to be gained from this, but when it comes to public cloud computing, the main benefit is shared hardware and the abstraction of computing resources away from physical hardware.

Once you understand how virtualization works, it’s not a big leap to realize the public cloud is simply an automated virtualized environment allowing users to create new VMs on demand. The user doesn’t have to worry about having enough hardware available or even where that VM is located. They simply create an instance of a VM by specifying attributes of how they want the machine configured, such as processor speed or memory capacity, and don’t care about where or how. The public cloud is simply a manifestation of a concept that has been maturing for quite a while – turning computation resources into a homogeneous commodity rather than a specialized product.

This is the point where light bulbs start turning on above the heads of executives. They start looking at these opportunities to use generic, commoditized computing resources and to remove the risks and costs associated with maintaining and managing data centers. All of the hardware that sits unused for weeks at a time because it’s only needed for special cases, like emergency production support releases, can be sunset. We can build a performance testing environment in a single day and then delete the entire thing by the time we go home. The possibilities are endless.

But let’s be clear about something. There is something about public cloud infrastructure that makes it special. It’s not the virtualization software. It’s not the hardware used. It’s the people.

Public clouds like Amazon EC2 have some of the best and brightest operations engineers on the planet, creating automated processes behind the scenes so that users like us just need to click a button. It’s not easy. Their environment is a hurricane of requests and issues that most people can’t dream of. They manage half a million Linux servers. 500,000 servers. Most people ask how they do it, and the answer is simple: they automate everything they can. Implementing that simple answer is where most people run into issues. Luckily for Amazon, it hires some of the best ops people in the world and probably pays them a small fortune to do this, both of which are simply not available to most businesses. Public cloud is about standing on the shoulders of the best operations talent in the world and taking advantage of their automation.

Remember those light bulbs above the heads of executives? Let’s take a sledgehammer to them with a single statement: all the data you put into the public cloud sits on shared hardware. Your CRM data in your cloud database? It could be sitting on the same drive, even the same platter, as your competitor’s. Your super-secret computation process could be sharing processor time with Reddit on the same CPU. The public cloud means we all share the same hardware. While there are quite a few security measures in place to make sure the virtualized segregation holds, we all know what typically happens: all secure data is deemed not cloud-worthy, and our virtualized hopes and dreams die with a whimper.

Until someone says the words “Private Cloud”.

Here’s the problem with “Private Cloud”. The public clouds are good because of the people and the culture, not the tools. In fact, I would be willing to bet that 99% of the tools used by Amazon could easily be purchased or found via open source. Most organizations simply don’t have the resources, the processes, or the intestinal fortitude to engage in such an endeavor. You need three things to be successful: a culture of automation; talented hybrid operations and development staff willing to blur the lines between coding and deployment; and tools to automate. You can buy the tools. You can buy the people. You can’t buy culture.

Let’s get past the limitations I stated earlier and imagine a hypothetical company with its own private cloud. Unless your software is built for the cloud, you’re not really buying anything. When I say built for the cloud, imagine being able to scale an application horizontally (think web application) by adding new web servers on demand. Here’s a basic list of what it takes to create a VM from scratch and take it into a production cluster:

  1. Create a new virtual machine.
  2. Install the correct image/OS.
  3. Automatically configure the VM to the specifications needed for the application (install Java, run patches, etc).
  4. Install any containers (JBoss, Tomcat, whatever) and configure them automatically (JMS, DB Connections, Clustering with IP addresses, etc).
  5. Deploy an EAR or WAR from your internal build repository and have it automatically join any existing cluster seamlessly.
  6. Configure the network and other appliances to recognize and route requests to the new server automatically.
  7. Do everything just stated in reverse.

Unless you can do something like this automatically, you aren’t cloud; you are virtualized. That’s not to say that simply being a company that utilizes virtualization isn’t a great first step. There are many uses of virtualization that can deliver benefits quickly, such as virtualized QA environments that can be spun up to run emergency production support releases. Virtualization is a great thing by itself, but virtualization is not cloud.

Second, the fourth and fifth steps are where most software shops get caught up. Their applications are simply not built for it. When the architects laid the groundwork for the application, they didn’t think about being able to quickly add new instances to the cluster, for example by not relying on DNS to handle request routing. Some decisions that are core to an application are made, or never addressed at all, in ways that can’t handle these kinds of environments. It’s a square peg in a round hole: while some applications can be retrofitted to handle the automation, others will need to be rearchitected. And there are some applications that simply don’t make sense in the cloud.

I encourage everyone to give virtualization a look. Every organization that has multiple environments for development, QA and UAT would benefit from virtualizing them. There are many software packages and platforms out there that are easy to use, and some are even open source. But before you start down the cloud path, make sure you do your due diligence. Are you ready for the cloud? Sometimes the correct answer is no. And that’s OK too.

Dodd Frank Implementation – Great Case for Agile Development

The Dodd-Frank Act (DFA) has been a major disruptor in 2012, especially in the energy industry. For those who are not familiar, the DFA has created a set of pretty extensive external reporting requirements, both for trade lifecycle events and for data aggregations. For organizations that are Swap Dealers, these reporting requirements are a major burden, requiring external reporting of trade lifecycle events, such as execution and confirmation, in time frames as short as 15 minutes. In the financial services world, these reporting burdens are not as big of a leap, as the infrastructure and generally accepted practices of external, real-time communication to facilitate straight-through processing are prevalent. The energy and commodities world is not at the same level of sophistication, both because commodities trading tends to require much more bespoke and complex deal modeling, and simply because it has never needed to report events externally in real time.

In addition to these requirements, there has been volatility in the rules themselves. Certain rules, such as Position Reporting, have been vacated (for now), leaving many projects in mid-flight. Other rules, such as Swap Data Repository reporting (Part 45), offloaded data interface and workflow definitions onto multiple vendors (p. 2139), resulting in a very fragmented ecosystem where many-to-many data mappings and formats were required for different asset classes. Additionally, the SDRs were implementing their systems during these rule changes and clarifications, making for a fairly unstable integration target. This type of work is perfect for agile development.

  • Short sprints would allow you to push out functionality in short time frames, giving the team a natural checkpoint to make sure the functionality is still in line with the latest legal opinion or CFTC change (Physical Options, anyone?). Every two weeks, the team can stop, demo the functionality it has built and receive feedback for the next sprint. Volatile requirements require a tight, frequent feedback loop. If you are building a huge technical spec document for a DFA implementation, you are toast.
  • Code demos naturally push end-to-end testing, giving the users a look at the real implementation rather than waiting until the last minute. The users can make adjustments at an earlier stage, reducing project risk and increasing customer satisfaction.

I would highly encourage all the companies that haven’t started their DFA efforts to look to agile to manage the project. Your developers and your users will thank you for it.

Demo Code: Create Persistence Jar using JPA

I love keeping my repository code in a single jar, isolated from all other code. Persistence code should be portable and reusable as a library specific to a database or even a schema. This wasn’t always the easiest thing to do, especially in an ecosystem where the library may run in a Spring-based webapp, a Swing GUI and a Java EE EJB application. Here’s the template code for how to get that ability.

First, let’s look at the basic EntityManager usage pattern. There are much more sophisticated ways of doing this but I’ll keep it simple for my own sake.

[java]
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

//Get the correct persistence unit and EntityManagerFactory
EntityManagerFactory entityManagerFactory = Persistence.createEntityManagerFactory("demoManager");
EntityManager entityManager = entityManagerFactory.createEntityManager();
entityManager.getTransaction().begin();
//Create an object and save it
entityManager.persist(new ApplicationUser());
//We are just testing so roll that back
entityManager.getTransaction().rollback();
//Close it down.
entityManager.close();
entityManagerFactory.close();
[/java]

JPA persistence is driven by a file on your classpath, located at META-INF/persistence.xml. Essentially, when creating the EntityManagerFactory, the Persistence class goes looking for the persistence.xml file at that location. No file? You get an “INFO: HHH000318: Could not find any META-INF/persistence.xml file in the classpath” error. Eclipse users: sometimes you have to do a clean to get the file to show up for the JUnit test. Here’s a simple persistence.xml that shows how to use JPA outside of a container.

[xml]
<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd"
version="2.0">

<persistence-unit name="demoManager" transaction-type="RESOURCE_LOCAL">
<class>com.stephennimmo.demo.jpa.ApplicationUser</class>
<properties>
<property name="javax.persistence.jdbc.driver" value="org.hsqldb.jdbcDriver" />
<property name="javax.persistence.jdbc.user" value="sa" />
<property name="javax.persistence.jdbc.password" value="" />
<property name="javax.persistence.jdbc.url" value="jdbc:hsqldb:." />
<property name="hibernate.dialect" value="org.hibernate.dialect.HSQLDialect" />
<property name="hibernate.hbm2ddl.auto" value="create-drop" />
</properties>
</persistence-unit>

</persistence>
[/xml]

Notice that when you create the EntityManagerFactory, you need to give it the name of the persistence-unit. The rest is pretty vanilla, and the Hibernate and JPA documentation cover the individual properties if you need additional explanation.

Next, let’s look at the basic JPA object.

[java]
import java.io.Serializable;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity(name="APPLICATION_USER")
public class ApplicationUser implements Serializable {

private static final long serialVersionUID = -4505032763946912352L;

@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
@Column(name="APPLICATION_USER_UID")
private Long uid;

@Column(name="LOGIN")
private String login;

//Getters and Setters omitted for brevity sake

}
[/java]

And if you want to use JPA inside a container, here’s a simple example of how the persistence.xml would change.

[xml]
<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd" version="2.0">

<persistence-unit name="demoManager" transaction-type="JTA">
<provider>org.hibernate.ejb.HibernatePersistence</provider>
<jta-data-source>java:/DefaultDS</jta-data-source>
<class>com.stephennimmo.demo.jpa.ApplicationUser</class>
<properties>
<property name="hibernate.dialect" value="org.hibernate.dialect.HSQLDialect" />
<property name="hibernate.hbm2ddl.auto" value="create-drop" />
</properties>
</persistence-unit>

</persistence>
[/xml]

That’s basically it. The most painful point comes when you try to use a persistence jar inside an EJB jar: it requires you to list out the classes in persistence.xml.

As always, the demo code is available at my public repository.

http://code.google.com/p/stephennimmo
