Enterprise Architecture in practice: 2011

Wednesday, December 14, 2011

Don't let the Enterprise Service Bus lead to Context Bleeding

In this article I would like to discuss bad usage of integration patterns and SOA tools, and why I often favor the Anti-Corruption Layer pattern of Domain Driven Design. I observe that many SOA projects do not end up with something more service oriented, but actually with an even bigger ball of mud. Suddenly many more systems must be live, projects slow down and things get more complicated. The ESB becomes the One Ring to Rule them All, which is not a good thing.
Why is this?

Good Service Orientation should bring clear separation of concerns, easier maintenance, less code, independent deploy and release cycles, more frequent releases, easier sourcing, and a higher degree of flexibility (among others). The idea of a bus (ESB) is good enough, but it does not relieve you from the real challenge: complexity and functional dependency. Where you previously had FTP files separating two silos, both silos must now be up and running at the same time. When things break you have 100,000 broken messages on the bus to clean up. These messages are a long way from home; they have broken out of their context. They are probably better understood within their domain.

The challenge is to find a design and migration strategy with lower maintenance cost in the long run. You should make things simpler and testable, by using DDD on your system portfolio.

The intention with these integration patterns is good; the Aggregate and Canonical patterns promise encapsulation, but they often end up handling complexity outside of its context. That leads to a tough maintenance situation.

Scenario

The initial stage where silos send and depend on information directly from each other:

Silos supporting and depending on each other

An ESB tries to make things easier, but the dependency is still there

Secondly the ESB comes to the "rescue". We just put a product in between and pretend that we now have loose coupling. We may get technically looser coupling and reuse of services/formats, but the functional dependency is still there. Most probably this is not a situation with less maintenance; you have just introduced more architecture. In most integration scenarios, people with deep knowledge of their own silo talk to each other directly. The canonical format supported by an integration team is just a man in the middle, and that team will struggle to understand the complexity behind the services and messages. Then your total maintenance organization does not scale and the functional throughput of projects slows down, because the integration team needs to know "everything".

Black Octopus tentacles

There is also another problem with this approach. Most tools (and actually their prescribed usage as taught in class) let the ESB product make adapters that go into the silo. Many silos have boundaries, like CICS, but others offer database connections so that adapters for the ESB actually glue into the implementation (of the silo). Now we are getting into an even more serious maintenance hell. Each silo has its own maintenance cycle and an organization supporting the complex system that it is. By not involving this organization, and not letting it support the services the silo actually offers to its environment, you will have trouble. The organization must know what their silo is being used for. How else are they going to support an SLA, or make sure that only consistent data leaves the silo? This is illustrated as a black Octopus with tentacles into the systems, tying the different parts of the organization even closer together.

Context Bleeding

This very quickly leads to context bleeding. Every ESB vendor has tools and repositories for maintaining the Canonical format. The problem is that the format is then maintained outside its domain and outside the organization supporting it. Now many Entities and Aggregates are outside their Bounded Context... Or even worse, they are replicated outside, recast in a more generic representation where the integration team has put an extra layer of complexity on top. This generic representation also hides the ubiquitous language, making communication between organizations even harder. And just to add to this: how testable is it? This is your perfect "big ball of mud". You do not want to handle complexity outside its domain.
It is a much better situation when those with deep know-how of the silo construct and support the services and canonical messages it offers. The integration team should mostly be concerned with structure and not content.

The same can be said about business processes orchestrated outside of their domain; it may get into a "make minor" approach that does not enhance ease of maintenance. Too often there is high coupling between process state and domain state. (see Enterprise Wide Business Process Handling)

Better approach

Dark green ACL maintained by each silo

So I think a better approach is to use Domain Driven Design and the Anti-Corruption Layer. This pattern better describes the ownership and purpose of integration with another domain, while keeping clear boundaries. The maintenance and release cycle of the integration is now aligned with the silo's own maintenance cycle. I also think there is a better chance of getting higher level services where systems cooperate. This leads to simpler integration scenarios, illustrated by a slimmer ESB.
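To make the ownership point concrete, here is a minimal sketch of an Anti-Corruption Layer that lives with the silo. Everything in it is hypothetical: the `LegacyPropertyRow` columns, the `PropertyAssessed` message and the `PropertyAcl` class are invented for illustration and do not describe any real system.

```python
from dataclasses import dataclass


# Hypothetical internal record of a legacy "property" silo (invented names).
@dataclass(frozen=True)
class LegacyPropertyRow:
    EIER_ID: str      # owner identifier, silo-internal column name
    VERDI_ORE: int    # assessed value in hundredths of a NOK
    STATUS_KD: str    # silo-internal status code, e.g. "A" or "S"


# Published message: the only shape other domains are allowed to see.
@dataclass(frozen=True)
class PropertyAssessed:
    owner_id: str
    value_nok: float
    active: bool


class PropertyAcl:
    """Anti-Corruption Layer owned and released by the property silo's team."""

    _ACTIVE_CODES = {"A"}

    def to_published(self, row: LegacyPropertyRow) -> PropertyAssessed:
        # Internal codes and units are translated inside the domain that
        # understands them, not by a central integration team.
        return PropertyAssessed(
            owner_id=row.EIER_ID,
            value_nok=row.VERDI_ORE / 100,
            active=row.STATUS_KD in self._ACTIVE_CODES,
        )


message = PropertyAcl().to_published(LegacyPropertyRow("P-1", 25_000_000, "A"))
print(message)
```

If the silo renames a column or adds a status code, only `PropertyAcl` changes, and it changes in the silo's own release. The bus only needs to carry `PropertyAssessed`.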

This is not complete without emphasizing the importance of functional decomposition between the silos, so that each has a clearer objective in the system portfolio. But this takes time, and often you need an in-between solution. ESB tools are nice for such ad-hoc solutions, but don't let them become your new legacy. Strive for granular business level services, so that you limit the "chatting" between systems and make usage more understandable (but this standardization is more a business challenge than an IT challenge). Too many ESBs end up as CRUD repositories, illustrating only an open, bleeding wound of the silo.

The objective is: Low Coupling - High Cohesion. Software design big or small - the same rules apply.

Don't let the Enterprise Service Bus lead to Context Bleeding by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Tuesday, September 27, 2011

Tax Norway's Proof of Concept

Why a Proof of Concept?

Continual Aggregate Hub and Processing
In our Target Architecture there is one central part (we will test out concepts from Continual Aggregate Hub, "Restaurant" and Aggregate Storage) and it has to do with tax and fee calculation. It does collection, assessment and calculation in one scalable architecture, and this has to be flexible as to collecting new types of information and putting new calculations on them. We believe it is unlikely that we find some COTS in this area, as this data and its rules are highly domestic and decided by our fellow politicians.

Our priorities:

Maintenance - Modular, clear and clean functional code

Testable - Full test coverage

Speed - Linearly scalable. Cost of HW determines speed, not development time.

Yes we can!
We have seen that there are so many properties in the large scale financial architectures (as the ones John Davies has talked about in several talks over the years) that are similar to ours, that we have to find out if our domain and challenges can be solved with this.

This is really the core of the PoC; we have to play, test and learn what this architecture is like for our domain and how it affects our ability to change. We need to show this to our stakeholders: business, architects, operations, programmers, designers etc. We must have this experience to reduce risk in planning, to understand the cost, to be able to communicate between these groups, and to be able to describe the architecture in more detail.

In practice we do a full volume test for 2009 and 2010, but with limited amounts of basic data and business rules, and not fully end-to-end. We will tackle the core challenges with a small rule set, and also have a "version 1" blueprint for new business initiatives we know will be coming. We target 2 years of data in memory, and assessment and tax calculation at 5000 tps. We target HW costs at 10% of today's level.
It is a playground for new technology. We are testing the domain solved in this type of architecture, and it is not a test of a product. Any acquisition of product will be done later.

Just do it
Late spring this year we opened a bid for participating in this Proof of Concept, where we presented our thoughts about the target architecture (We have gotten a lot of good feedback on this :-) ). EDB/Ergo, Bekk won the bid, and they teamed up with Incept5.
We are starting with Gemfire as the processing architecture, and with as much plain vanilla POJO code as we can. We will be working on this through January 2012. This will rock!

2012.01.22: The results are here, and will be presented at Software 2012.

Tax Norway's Proof of Concept by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Thursday, September 22, 2011

Enterprise wide business process handling

Introduction
(October 2014. This article is updated with the Case Container)
Understanding the requirements, functional and non-functional, is vital when the architecture is defined. The theme in this article is about enterprise wide business process handling. The challenge is to make different types of systems participate in the business work flow and understand what happened for later insight. BPM, Business Process Management, is a huge field with many tools, notation forms, patterns etc. (See process handling in Continual Aggregate Hub and cooperating federated systems). (A system in this text is the same as a Module, although it may reflect that our current systems are not good Modules. We want our current systems to cooperate as well as our new Modules.)
We have chosen to think Domain Driven Design and to elaborate on these main issues:

how to automate the business process between cooperating Modules
how to ensure that some manual (human) task is done
how to achieve high degree of flexibility in changing business processes
how to know the current state of a Party
how to know what has happened to a Party
how to measure efficiency in the business activities over time
how to make very different types of systems cooperate

And some non-functional requirements:

how to handle the operational aspects over time (batch vs events)
how to best support that a manual task is now automated
how to make it testable
how to have it scalable
how to change and add Services and Modules without downtime
how to change process without downtime
how to re-run a process, if something has failed for some time
how to handle orchestration (defined process) and choreography (event driven) side by side

Observations of SOA tool state and of integration patterns
I will not discuss this too much, but just summarize some observations (I'll get back to how in some later blog):

In-flight processes are a pain
Long lasting processes are a pain (both: software needs update now and then)
Process and the state of some domain entity are often not loosely coupled. A lot of processes occur and function within the domain (a "make-minor" approach by some process-system is not good)
Processes are hard to test
Scaling and transaction-handling gets complex
Tools have too much lock-in
The promise of BPEL visual modeling to communicate with business fails
The canonical integration pattern often leads to context bleeding and tight coupling
The aggregate integration pattern often is a sign of complex integration, that probably should be addressed with a system by itself
Business process state is hidden, and history of events is lost or is drowned in technical details
Message brokers are great at moving messages, but bad at keeping history, and not a good tool for operational flexibility
Too many parties are involved so that maintenance gets slow

Our logical 5 level design
The main goal is to have full control of all events that flow between cooperating Modules, without ending up with an uncontrollable event-driven system. (An event-driven system may just as well be diagnosed as an "attention-deficit/hyperactivity disorder (ADHD)" system.)

Enterprise Process Log
This is purely a log component, with services for registering business events and answering queries about them. It has no knowledge of the business process, but of course has some defined keys and types that define a valid business event (a business activity leads to a business event). It is the Modules that emit domain events to the log component, and the domain defines what the events are. It is a place to store the business level CQRS type of events (or Soft State, if you like, from BASE); the more detailed events are kept in the Modules. Any type of system can emit events, either live or in large batches. The implementation effort to send events is small, and the events may be informational and not necessarily used in an automated process. This log will give insight into what happened with a Party. The lifetime principle is taken care of, as this log must be very long lasting. So we want it as simple as possible. (see The BAM Challenge)
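A minimal in-memory sketch of such a log component could look like the following. The field names (`party_id`, `event_type`, `source_module`, `occurred_at`) and the method names are my own assumptions for illustration; a real log would sit on durable storage behind a service interface.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BusinessEvent:
    party_id: str        # the Party the event concerns
    event_type: str      # defined by the emitting domain, opaque to the log
    source_module: str   # Module that emitted the event
    occurred_at: datetime


class EnterpriseProcessLog:
    """Append-only log: checks the keys, stores events, answers queries."""

    def __init__(self) -> None:
        self._events: list[BusinessEvent] = []

    def register(self, event: BusinessEvent) -> None:
        # Structural validation only; the log knows nothing about processes.
        if not (event.party_id and event.event_type and event.source_module):
            raise ValueError("party_id, event_type and source_module are required")
        self._events.append(event)

    def register_batch(self, events: list[BusinessEvent]) -> None:
        # Large batches from legacy systems use the same path as live events.
        for event in events:
            self.register(event)

    def history(self, party_id: str) -> list[BusinessEvent]:
        # "What has happened to this Party?", oldest event first.
        matching = [e for e in self._events if e.party_id == party_id]
        return sorted(matching, key=lambda e: e.occurred_at)


log = EnterpriseProcessLog()
log.register(BusinessEvent("party-42", "ComplaintHandled", "Complaints", datetime(2011, 9, 20, 10, 0)))
log.register_batch([
    BusinessEvent("party-42", "TaxStatementProcessed", "Assessment", datetime(2011, 9, 1, 9, 0)),
    BusinessEvent("party-7", "TaxStatementProcessed", "Assessment", datetime(2011, 9, 2, 9, 0)),
])
print([e.event_type for e in log.history("party-42")])
```

Note that nothing in the class refers to a process definition; that is what keeps it simple enough to be long lasting.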
Global Task-List
This is a simple stand-alone task-list whose sole responsibility is to assign tasks to groups of Case Handlers (CH) that do manual tasks in the enterprise wide domain. The task-list has no knowledge of the overall process. The tasks the task-list receives contain enough information to explain what the task is, and to redirect the CH to the right GUI for doing the task. The tasks are defined and maintained in the Module that has a need for manual work, but dispatched to this common task-list. These are tasks that are well defined in a work-flow. When a task is done, or no longer relevant (the Module that owns the task decides), the task is removed from the task-list.
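As a sketch only, the responsibilities described above fit in a very small component. The `Task` fields, the method names and the example URL are hypothetical, chosen to mirror the text: a task carries its own explanation and GUI address, and only the owning Module may remove it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: str
    owner_module: str   # Module that defined the task and decides when it ends
    group: str          # group of Case Handlers the task is assigned to
    description: str    # enough text to explain what the task is
    gui_url: str        # where the Case Handler is redirected to do the work


class GlobalTaskList:
    """Holds dispatched tasks; has no knowledge of the overall process."""

    def __init__(self) -> None:
        self._tasks: dict[str, Task] = {}

    def dispatch(self, task: Task) -> None:
        self._tasks[task.task_id] = task

    def tasks_for(self, group: str) -> list[Task]:
        return [t for t in self._tasks.values() if t.group == group]

    def remove(self, task_id: str, requested_by: str) -> None:
        # Done or no longer relevant: the owning Module decides, not the list.
        task = self._tasks[task_id]
        if task.owner_module != requested_by:
            raise PermissionError("only the owning Module may remove a task")
        del self._tasks[task_id]


tasks = GlobalTaskList()
tasks.dispatch(Task("t-1", "Complaints", "complaint-handlers",
                    "Review complaint from party-42",
                    "https://example.invalid/complaints/t-1"))
print(len(tasks.tasks_for("complaint-handlers")))
tasks.remove("t-1", requested_by="Complaints")
print(len(tasks.tasks_for("complaint-handlers")))
```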
Process Flow Automation
Once we have the events in the event log, automating the next step is quite easy and can be done in a lightweight manner. A simple Message Driven Bean may forward the message in an event-driven manner via some JMS-queue, or a large legacy system may query for a file once a week because that is the best operational situation for it (operational flexibility is also discussed in CDH). Events may also be logged that only come into operational use a year later, making maintenance flexible and history robust.
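The point that live forwarding and weekly extraction can read the same log is easy to sketch. The two functions below are hypothetical stand-ins (plain Python instead of a Message Driven Bean or a file export), and `LoggedEvent` is an invented, simplified event shape.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LoggedEvent:
    sequence: int      # position in the log
    event_type: str
    logged_on: date


def forward_new(log, cursor, wanted_type, deliver):
    """Event-driven style: push unseen events of one type, return the new cursor."""
    unseen = [e for e in log if e.sequence > cursor]
    for event in unseen:
        if event.event_type == wanted_type:
            deliver(event)
    return max((e.sequence for e in unseen), default=cursor)


def batch_extract(log, wanted_type, start, end):
    """Batch style: pull all events of one type logged in [start, end)."""
    return [e for e in log
            if e.event_type == wanted_type and start <= e.logged_on < end]


log = [
    LoggedEvent(1, "HouseSold", date(2011, 9, 1)),
    LoggedEvent(2, "AddressChanged", date(2011, 9, 2)),
    LoggedEvent(3, "HouseSold", date(2011, 9, 9)),
]

delivered = []                      # stands in for a queue to a live consumer
cursor = forward_new(log, 0, "HouseSold", delivered.append)
weekly = batch_extract(log, "HouseSold", date(2011, 9, 5), date(2011, 9, 12))
print(cursor, len(delivered), len(weekly))
```

Each consumer keeps its own cursor or date window, so one can be re-run or added a year later without touching the emitting Module.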
Case Container
This is discussed in this article about a generic Case Container representing all Cases handled in our domain. Its purpose is to contain the Case with all its metadata, the process state and references to all incoming and outgoing xml-documents.
Case Handling System (super domain)
This is usually called Case Handling and consists of a case which is outside the existing Modules or systems, and that needs the completion of many formal sub-processes, but where it is not possible to foresee at startup how the case will be solved. This is typically where the known business process ends, and a more ad-hoc process is applicable. This system also supports the collection of different types of information relevant for such case-to-case systems. This information may very well be external information collected in a manual manner.
(By 2014 we have not gone any further into this part of the overall design. It seems that the CAH and the Case Container are sufficient.)

Above is the logical design, and this is what we think we need. You might say it follows a hub-spoke design, where the Modules are the spokes and these 4 elements comprise the hub. These are all 4 discrete components that interact in a service-oriented manner, with each other and with other Modules or systems. The main idea is that this will ease maintenance and reduce the need for customizing COTS.

Ill 1. Basic flow and service orientation

Illustration 1. Shows a fire-and-forget situation (green) where Tax Calculation and Collection are both interested in House events. Tax Calculation wants them live, so the processing forwards each event, while Collection wants them every 2 months, via file, and then issues service requests (yellow) to Party to get details. The EPL event holds a reference to the Case Container. The notified Module then uses the Case Container to open all relevant information for this Case and the process state the Case is in.

Ill 2. Application layer interacts with EWPH

Illustration 2. Shows a DDD Module where the application layer interacts with the enterprise wide process handling. The green line shows how the Case Container is shipped to the Module, where its references are opened by the Application layer and sent to the Domain layer for handling. The Case Container really is what a container is in the logistics world: it brings a complete set of goods from one place to another.

We are implementing this with REST and XML, where feeds play an important part in transporting events and data. URIs represent addresses for Modules, and are linked to specific Case Container Types.

We do not mandate the usage of the Task-list. If there is a need for a task-list internal to a Module, because it is more efficient for users who handle tasks solely in that Module, that is OK as long as it gives better maintenance in the long run.
Also we do not mandate which technology does the different work in the Process flow automation; it may be Message Driven Beans forwarding to some queue, Mule, Camel, BPEL for some orchestrated process, or a simple query to file.

Design and overview
Of course you still have to have a good design and understanding of the business processes. And it must be there to communicate between business and IT (BPMN is great for this). But as in other areas of systems development: design is for communicating between people, and implementation is for communicating with machines. Therefore a combined design-implementation (e.g. BPEL) will have a hard time achieving both.
The business process is not fragmented, but I argue that the implementation of the business process is best handled in the above-mentioned manner; The process will sometimes occur within a Module (system), and sometimes in-between.

Enterprise wide business process handling by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Tuesday, September 20, 2011

The BAM challenge

One interesting issue arose when we looked into the requirements for Business Activity Monitoring (BAM). The main questions to answer are: "How is the organization performing?" and "Are we more effective than last year?". Key Performance Indicator (KPI) is a well established term, but how do we arrive at the KPIs?
A KPI can only be answered over time, and for large governmental organizations, that's a long time. Just reorganizing may take a year, and making the new organization perform better takes longer than that.

To measure you need some measuring points, and they must be comparable over time. It is not about a business performance measuring tool by itself; it is about what business activities were performed, and how much effort some business entity put into each such activity.

There are three challenges in this:

First, a reorganization usually affects how account dimensions in the financial systems are organized, which makes the accounts discontinuous and disturbs the measurement of the effort put in. A long-lasting Business Effort, measured in cost (Business Cost, BC), must be defined. When the account dimensions are defined, they must also map to this BC, so that we keep continuity.
There are challenges to this; "How much does an IT system cost?", "What is the price of this automated task". (Many organizations have a hard time identifying this).
Secondly the Business Activity (BA) itself over time is performed in different IT systems or done manually. From experience with performance tuning, I believe it is obvious that the IT systems must support a long lasting BA that survives various implementations of that activity. That is what the Enterprise Process Log (see Enterprise Wide Process Handling) is all about. This is where we collect and keep the BA's.
The BC and BA must have a comparable period of time.

The BAM-tool is not the solution by itself, and we don't want too much tied up in some SOA implementation. Also, BAM-tools are often concerned only with what happens on the ESB, while there are many other places that may emit BAs and BCs. The solution must be simple and must last long.

KPI1=BC1/BA1 for a period
KPI2=BC2/BA2 for a period

In our domain a BA would be "Number of tax statements processed" or "Complaints handled" during some time period. The BC would be "Cost of people and systems for assessing tax statements", and "Cost of complaint department", during the same time period.
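As a worked example of the formula, the sketch below computes cost per activity for two periods. The figures are invented for illustration only, and `PeriodMeasure` and `kpi` are hypothetical names; in practice the BA count would come from the Enterprise Process Log and the BC from the accounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodMeasure:
    period: str             # standardized period label, e.g. "2010"
    business_cost: float    # BC: cost of people and systems, in NOK
    business_activity: int  # BA: e.g. number of tax statements processed


def kpi(measure: PeriodMeasure) -> float:
    """KPI = BC / BA, i.e. cost per performed business activity."""
    if measure.business_activity <= 0:
        raise ValueError("no business activity recorded for " + measure.period)
    return measure.business_cost / measure.business_activity


# Invented figures, for illustration only.
y2010 = PeriodMeasure("2010", business_cost=12_000_000.0, business_activity=400_000)
y2011 = PeriodMeasure("2011", business_cost=11_000_000.0, business_activity=440_000)

print(kpi(y2010))  # 30.0 NOK per statement
print(kpi(y2011))  # 25.0 NOK per statement
```

The comparison only means something because both periods use the same BA definition and the same BC mapping, which is exactly the continuity argued for above.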

We will collect BA's from the Enterprise Process Log, and use our data warehouse for the compilation with the BC's and the analysis. The analytics might be in Excel, although we may buy something for this analysis and reporting.
It is the long lasting measuring points and a standardized period of time that is the real value.

The BAM challenge by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Saturday, September 10, 2011

The Norwegian Tax Administration's (Skatteetaten) Architecture Principles

Purpose
This document describes mandatory IT architecture principles (as part of SKE's overall enterprise architecture).

Rules for applying the architecture principles:

All development, further development and acquisition of IT products and systems shall follow these principles.

Architecture and design decisions, as well as the choice between solution alternatives, shall be linked to the principles.

Exceptions and deviations must be justified and approved according to the defined enterprise architecture process.

Background
The architecture principles were drawn up in the autumn of 2009 as part of the enterprise architecture project. The principles are derived from SKE's IT strategy and from principles given by Difi ("Overordnede IKT-arkitekturprinsipper for offentlig sektor", version 2.0, published by Direktoratet for forvaltning og IKT on 8 October 2009).

Contents
The first principles are user oriented and describe how functionality shall be offered to users (citizens, taxpayers, partners (public agencies and third parties) and internal employees). They cover:

Availability
Usability
Integrity
Collaboration

The remaining principles are more technically motivated and describe how IT solutions shall be designed and selected:

Capacity for change
Reuse
Life cycle

Availability
Description	Functionality shall be easily available to users where and when they need it.
Motivation and goal	In today's electronic society, external users expect access to public services and information regardless of time and place. The goal of this principle is both to make it easier to act correctly and, through a high degree of self-service, to save the agency work and resources.
Sub-principles, focus and scope	The main focus of this principle is that services for citizens, taxpayers and partners shall as a main rule be available: 24/7, but note that this expectation applies to the ability to carry out one's own part of work processes with the public sector (submit a form, answer a request, ask for information etc.), and does not imply that requests for [case] handling in the agencies must be processed and answered outside normal working hours; over the channels the user expects/wants (multi-channel; currently this applies especially to internet and paper, and to a limited extent SMS). Sub-principles: It is important to be clear about which services shall be available, over which channel, and at what time. (E.g. "submit form" is available around the clock, while "call me back" is within 08:00-16:00.) It is important to be clear about what QoS ("Quality of Service") the user can expect. (E.g. the submitted form will be settled in the middle of June this year.) Scope: Services for internal employees do not have equally strict availability requirements. Here the main rule is that services shall be available during daytime and at the employee's workplace.
Consequences
Reference and relation to other principles	This principle corresponds to Difi's principle "Tilgjengelighet" (Availability).
Recommendations

Usability
Description	Electronic user services shall be adapted to the user and the user's usage pattern, facilitate efficient use, and ensure good quality of the task being performed.
Motivation and goal	The goal of this principle is to make it easier for the user to act correctly. For internal users this is important for achieving higher efficiency and motivation. This means that it shall be: easy to find correct and relevant information and to get an overview of what the user can do; simple to fulfil one's obligations, by enabling the user to understand what is expected of him; easy to avoid unnecessary reporting, i.e. by using pre-filled forms and avoiding double reporting.
Sub-principles, focus and scope	User-friendly services and their work surfaces shall follow these sub-principles: universal design of work surfaces; simple and clear access to necessary and useful information; work surfaces or screens shall for the user form part of one "dialogue", even if the individual screens are offered by different services in different systems (uniform technical platform for GUI); a uniform technical platform for GUI is also important for offering the user the simplest possible handling of security. Scope: For internal users it is most important to focus on work processes being carried out simply and "smoothly", e.g. without the user having to start and log in to many different systems and work in many different types of screens within one and the same activity.
Consequences
Reference and relation to other principles	This principle corresponds to Difi's principle "Tilgjengelighet" (Availability).
Recommendations

Integrity
Description	Functionality shall be offered in accordance with laws and regulations, securely and confidentially, and with high quality, so that it ensures integrity for Skatteetaten, partners and taxpayers/citizens alike.
Motivation and goal	Skatteetaten's systems must ensure that the agency gives citizens a good and fair service. Skatteetaten shall treat information confidentially and identify citizens securely, so that Skatteetaten has credibility. Skatteetaten shall appear as a serious actor that is conscious of its social responsibility.
Sub-principles, focus and scope	Main focus: The principle implies that security and functionality in accordance with laws and regulations have high priority. In addition it shall be simple to get a complete picture of an external user's / partner's cases in order to give the best possible service. SKE should show awareness of its environmental responsibility through Green IT. This requires fulfilment of the following sub-principles: secure identification when using the solution; secure identification in connection with calculation / processing in the system (statements / calculation of tax etc.); confidentiality of information shall be guaranteed; access control on both data elements and data values; tracing of changes to and lookups of data (audit / history); tracing of system-made decisions (transparency); strong focus on quality (correctness) and ethics. Correctness is best ensured by understanding, and being able to verify, the outcome space of a calculation module.
Consequences
Reference and relation to other principles	This principle corresponds to Difi's principles "Sikkerhet" (Security) and "Åpenhet" (Openness).
Recommendations

Collaboration
Description	Functionality shall be offered in such a way that it is simple to collaborate within and with the agency.
Motivation and goal	There are demands for increased standardization, coordination, shared solutions and automation across agencies and sectors. It must be simple to exchange data between solutions (both internally in SKE and externally with partners) in order to combine them into a larger whole. Functionality (services) must be seen as part of cross-cutting business processes. Work processes must be coordinated to ensure correct, uniform and clear handling.
Sub-principles, focus and scope	The principle entails high requirements for interoperability, which can be described by the following sub-principles. Technical interoperability: flexible, simple and reusable integration between solutions; that systems can interact via common technical standards; a common logical view of the data between solutions and between different services (especially within one and the same work process); that services are offered in an understandable context, i.e. as part of an overall work process (where this is natural); an understandable relation between information, other services, the context or process it is part of, the user's options and obligations, and the agency's tasks. Semantic interoperability: that systems understand the meaning of a service or of data. Organizational interoperability: that systems participate in work processes in a consistent context. It may also involve organizational changes to realize the benefit of collaboration.
Consequences
Reference and relation to other principles	This principle corresponds to Difi's principles "Interoperabilitet" (Interoperability) and "Tjenesteorientering" (Service orientation).
Recommendations

Capacity for change
Description	The functionality of all solutions shall be flexible enough to follow expected changes in society and in the agency's services within reasonable time and cost.
Motivation and goal	The goal of this principle is to achieve higher flexibility in the functionality of the enterprise's IT systems. Society and technology are in constant development and change. It is not realistic that IT solutions can remain static over a long time and at the same time fulfil their goals in an appropriate way. The goal for future-oriented solutions is thus not primarily to last "forever", but rather to be easy to change so that they can quickly follow developments.
Sub-principles, focus and scope	High capacity for change is especially important for: the sequence and design/execution of the business processes; fast implementation of changes in rules and legislation; introducing and taking into use new data (statements); introducing and taking into use new IT services. Note that capacity for change concerns not only the composition of services, but also the construction of the services themselves and their supporting systems, as well as data models. Scope: This principle applies to the functional requirements; the non-functional requirements are handled under the Life cycle principle. It is not important to be able to quickly introduce new business processes or entirely new types of service areas, since one can assume that Skatteetaten's business domain will not change substantially in the foreseeable future. Sub-principles: Capacity for change exists at micro and macro level: within a service or a system, and in the business processes that use services. At micro level each system should make a conscious decision about what must be flexible (unnecessary flexibility is expensive). Facilitate methods and techniques that reduce risk in code changes. Facilitate methods and techniques that effectively confirm successful changes. Separate selection criteria (who), business logic (what) and storage logic (where), so that it is clearer where changes shall be implemented. Be aware that all descriptions of a system (code, specification or similar) that are used in an automated context are equally part of the whole and can be a source of errors. Clear areas of responsibility / services mean that changes are isolated to subsystems (and not a bit here and there). It is important to be able to quickly trial-run changes so that one sees the effect on the data the service concerns (testing / simulation).
Consequences
Reference and relation to other principles	This principle corresponds to Difi's principles "Fleksibilitet" (Flexibility) and "Tjenesteorientering" (Service orientation).
Recommendations

Reuse
Description	Reuse shall be prioritized, both by choosing already reusable functionality over new development/acquisition and by investing in reusability when developing/acquiring new functionality.
Motivation and goal	The goal of this principle is to achieve higher cost-efficiency in the long term through reduced complexity and less redundancy, which in turn reduces development, operations and maintenance cost and enables faster development. In a typical project-driven IT organization such as SKE's, projects are measured against deliveries within time and budget. The motivation for investing time and resources in solutions that are reusable by others is therefore not high enough to ensure reuse unless this is set as a goal in its own right.
Sub-principles, focus and scope	The principle implies the following sub-principles: Each project and further-development activity shall, in cooperation with the agency's architecture group, build IT components that others (both within the agency and elsewhere) can use later ("build or justify"). Each project and further-development activity shall, in cooperation with the agency's architecture group, reuse IT components that already exist (internally in the agency or at other public agencies) instead of building/acquiring new ones ("reuse or justify"). Encapsulation of system-specific logic shall be done as close to the source as possible. Master data, i.e. the agency's most important core data, shall as far as possible and appropriate be fetched directly from, and updated in, an original source controlled by an owner. The information model for such data shall be common to the enterprise. Reuse shall also apply to screens, which shall be able to take part in a work flow. The user does not need to know which system offers a screen. Reusable services are developed according to concrete needs in the systems that will naturally offer the service. It is important to assess the non-functional requirements, since different uses of a service may require different implementations.
Consequences
Reference and relation to other principles	This principle corresponds to Difi's principles "Fleksibilitet" (Flexibility) and "Tjenesteorientering" (Service orientation).
Recommendations

Lifetime
Description	In development and procurement, the benefit and cost picture shall take into account the whole of SKE's system portfolio, the full lifetime of the solution, and the non-functional requirements.
Motivation and goal	The goal of this principle is to get good effect from IT investments over a longer period. Often the time horizon considered is too short. SKE's core systems have a lifetime of several decades, and choices for them must therefore focus more on future-oriented solutions. The goal is to free up most of the resources and time to do new things, and not to have to redo things (because solutions and technologies have become too old).
Sub-principles, focus and scope	The principle entails the following sub-principles: It is very important to invest in open and recognized standards, because of multiple vendors (avoiding vendor dependency, good access to skills), continuous further development and a broad range of transition solutions. It is especially important to store and handle data in future-oriented formats, since it will always be easier to update functionality than the format of historical data. Open source is also a factor that should be considered, since there is often an active "community" that ensures further development, and one has the option of making changes oneself. All solutions shall as far as possible and appropriate support: - good and simple operability - good and simple maintainability. When choosing solutions, always: - consider off-the-shelf solutions before development/customization - consider outsourcing versus in-house handling - avoid monopoly and strong vendor lock-in - prioritize modern and future-oriented, but well-proven technology. It is important to understand the non-functional requirements, such as the number of concurrent users, data volume, transaction frequency, concurrency, security, uptime requirements, recovery/backup requirements, internationalization etc., today and tomorrow. These are important for achieving as simple an architecture as possible, and one should have a conscious view of how they develop over time.
Consequences
Reference and relation to other principles	This principle corresponds to Difi's principles "Openness" and "Scalability".
Recommendations

Thursday, June 30, 2011

Lifetime principle. Continuous migration

We have a Lifetime principle defined for our Enterprise Architecture.

So what does this principle (and requirement) actually state?

It states that we should choose components and software systems that have the best cost/benefit over their lifetime. That is nice. What we all need. Problem solved, right?
The main problem is that we do not know the answer to either of them; we do not know the lifetime of the technology we choose, and we do not know the lifetime of the business we are supporting. We probably know more about our business overall; tax systems are here to stay, and certain elements are fixed (someone must pay their tax). But we do not know the details of how our processes will look in 20 years, nor the details of how to calculate tax and what types of tax we will be imposing (CO2 footprint tax?).

We must make sure that the important stuff for us - requirements (see: Requirements), data, code base, and business state (as discussed in other blogs) - is handled and stored in a way that makes it versatile. Requirements by themselves are interesting, because the systems in which we store requirements actually must last longer than the systems they define. Just think about it; which system would you trust with your requirements? (Systems Architect, Troux, Mega, Qualiware) Which will be there in 20 years?

When we have defined our target architecture, it must have certain capabilities to support our domain, but it must also comply with the lifetime principle. Distinct domains define systems that cooperate (they could be called mega bounded contexts, to extend the Bounded Context of DDD; in our case they would be Party, Tax calculation and Money, which in turn have smaller Bounded Contexts). This is the most important feature, because these systems can then be exchanged as time goes by. Well, this is not Lego, so we should add "with as little effort as possible". This also helps in isolating custom functionality from COTS. In both cases we find Domain Driven Design to be a great EA approach.

The overall architecture must adapt to continuous change in technology and business. Surely it is a hard nut to crack, but it is still the main goal for the target architecture. It must support continuous change, because that is the normal case, and we as an organization must support and endorse change.

With the lifetime principle posed on our systems, the overall effect is that:
The architecture must support a condition of continuous change.

Lifetime principle. Continuous migration by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Sunday, May 15, 2011

Steering projects; Where is the improved in "new and improved"?

Enterprise architecture is all about improving, continuously. Our target architecture is our aim, and every step towards it may alter the aim. The new aim is a better aim. That is what it is all about: improving as we walk towards a better architecture. The steps in this walk are our projects. We need to identify, define, implement and steer our projects. We depend totally on each project understanding that it is part of the whole. Every task in the project will go into greater detail. These details will bring us a better understanding; either some part of the project's goal is a bad path, or it is a good one. Maybe working through the project's tasks really turns up a golden egg, or a pit of snakes.

Projects live a life of their own. The team is dedicated to its goal, and it will continue to walk within the mandate, with quite a bit of self-focus. Very rarely, if ever, has a project said to the steering committee "our task is unnecessary, please stop this project". It is not that the project is not trying to reach its goal, or is not loyal; the team is honest in doing what it believes is correct. The purpose of the steering committee is, well, to steer. The steering committee in this context is any entity that is responsible for steering the project. I remember a project manager once said that informing the steering committee was like growing mushrooms: "keep them in the dark and feed them horse shit". And maybe this illustrates the challenge; projects must not obey the project mandate or architectural blueprints alone.

In this context the mandate must be extended with the target architecture and a description of what the project must contribute. This includes components that are there already (they may be extended) and new components that later projects will benefit from. The latter is the tougher one. It is then vital that projects understand the purpose (why and for what) of such a component in the target architecture, and have a steady set of principles. The message must be repeated and repeated; the enterprise architects must be absolutely sure that the architects in the project understand why. Remember that the experience a project gains may eventually alter the target architecture, so there must be close cooperation between architects at many different levels.

So this brings me back to the headline: where is the improved in "new and improved"? When we staff projects, most often we choose people who know the functional domain and someone who knows the new technology. And what do you get? Too often, the existing domain in a new technology. Does this bring us closer to the target architecture? Can the steering committee see what is coming? Too often they cannot.

In addition, I think the overall process must be agile, so that the project mandate and budget may be adjusted along the way. We also have the good opportunity of project demos as the sprints go by. This makes it easier for the steering committee and the enterprise architects. I think Gartner has coined the term EAD, "Enterprise-class Agile Development", for this overall process: using projects as agents of change on the path to an improved enterprise architecture.

All you get is new, but not improved, if you don't make sure that the project understands the purpose of the target architecture.

Steering projects; Where is the improved in "new and improved"? by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Friday, April 29, 2011

Requirements are long lasting

I have worked with large systems and maintenance for a long time, typically within the Enterprise Application domain. I have worked in projects where the code should be "self documenting" and in projects that follow RUP diligently. The first kind became unstable after a while because of unclear requirements. There was too much code inspection to explain to the functional owners what the system did, and it was hard to find out whether a given feature was a bug or not. Also, refactoring was hell. In the second kind we spent too much time not writing code, and the system really did not reflect the formal requirements. It was just too much documentation to put up with. From experience I think that there must be an information model (and some Glossary or Concept descriptions), a description of behavior (e.g. Use Cases), and a sequence or activity description tying this together. I have also worked with TDD (or similar, where tests are written in pair with the code) and seen what a great effect this has on good code modularization.

I have no experience with user stories, but I see and hear (together with BDD and TDD) that it is a good approach to requirements handling that fits the backlog and is suitable for the sprints. Together with a suite of functional tests, it makes a solid combination for the project delivering an application. My concern is that developing the system is only approx. 10% of the lifetime cost (or less). During the lifetime there will be different system owners and developers, and they need to understand how the system should work. There will be reviews and overall planning later on that need to understand the system without looking into the code. In such a context the user stories seem too fragmented, and more like a changelog for the development than a description of the system.

For the last 90% of the lifetime there is a need for a robust requirements base. How do we ensure that this is maintained? I do believe (from experience) that the requirements must be described when the system is developed and be aligned with the version that is released. At that time I also think that the functional parts should be described as a whole, more as defined by good Use Case practice. Regardless of how you wish to document the requirements, at least keep them in a common and persistent store (not post-it notes), so that there is a stable base for future reference.

Also, it is hard to keep such documentation in place; it certainly requires diligence, obligation and thoroughness to keep up.
In a development team I believe that someone must be responsible for the overall functional package, and that it is not just up to each developer to write user stories. All too often I have seen functionality written again without reusing or extending what has already been developed.

Requirements are long lasting by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Thursday, April 14, 2011

New tools, old methods. A masons story

We have all these new tools, but we must also change the way our developers and architects solve problems with them. We have to understand new tools and methods to actually make better software systems. Otherwise we get, for example, classes with 6000 lines, bad separation of concern between layers or components, no clue of MVC, no interfaces, etc.

As an example from constructing a house:
You have decided to construct a house in wood. All your requirements point in that direction. But all you have are masons. What do you think that house would look like without the proper training (or on-site inspection of an existing wooden house)?

A house in wood is constructed very differently from a house in bricks. You first build a frame with beams, then insulate and cover with boards. The mason would probably lay every beam horizontally with mortar in between. It would surely be a costly house, and it would have none of the qualities we originally specified. Also, the masons would think you were an idiot.

New tools, old methods. A masons story by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Tuesday, April 12, 2011

Java is mature. The software-teens are over.

Just a few words on why we would expect less expensive maintenance and better quality in our software systems by using Java and following the standards (de-facto or formal). And also on why I think Java is mature and will not go away for some time.

I would like to compare with the construction of things. Until well into the industrial revolution, every part of something (eg. a clock) was made by hand. And every craftsman had his own approach to this, and it was labor intensive. Screws for example had all sorts of screw threads with different angles, lead, pitch, diameter and handedness. This also meant that if something broke, you had to get that exact copy, probably from the same craftsman (or his son...), no other screw would do.

The same goes for the construction of houses. Well into the 1950s there was no obvious standard. Every window in a house was tailor-made; they looked alike but had small deviations due to optimizing the usage of materials. The size of a beam was decided from the "feeling" of the carpenter and the available resources. Standardization has brought many good things into play: training of carpenters, estimation of cost, material and labor, lower prices due to mass production; everything is built in 60cm sections.

Now what has this to do with software? Most of today's legacy systems (all the way from the early days of software development) were built in a time or with a technology where everything had to be made from scratch. There were virtually no components and there was no good technology for using components. Training was also bad; Computer Science was in its infancy. Every programmer mastered his own discipline, and strange arguments could be heard in the corridors: should arrays start with 0 or 1; big-endian or not; new line before or after comma? These "great" discussions are really a waste of time, the real value is not there. It is in standardization and in doing things so that others can do maintenance and understand what the system does. But as with the screws and windows, doing is learning; it was the software-teens we had to get through.

These legacy systems work, but only after a close study, and they will always be hard to maintain, just because everything is tailor-made. If it was a building it would either be condemned, or it would be protected by the antiquarian.

Recently we had a PoC where we found that validating an xml-file (by streaming through it) was just some lines of code. The various legacy versions are much larger. Which do you think gives the most cost-efficient maintenance and ease of change? Which is most robust when the people doing maintenance on the system change? Which one is faster?

...
// Load the XSD and build a validator from it
SchemaFactory schemaFactory =
    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(new File(schemaFile));
Validator validator = schema.newValidator();
// Stream through the XML file; validate() throws SAXException if it is invalid
validator.validate(new StreamSource(
    new BufferedInputStream(new FileInputStream(xmlFile))));
...
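
The same validation can be tried end to end without any files. The following is a minimal sketch, not the PoC code: the class name XsdValidationSketch, the helper isValid and the tiny <party> schema are made up for illustration, and the inputs are in-memory strings instead of schemaFile and xmlFile. Only standard javax.xml.validation calls are used.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class XsdValidationSketch {

    // Hypothetical schema: a <party> element holding a single <name>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='party'><xs:complexType><xs:sequence>"
      + "<xs:element name='name' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true when the XML conforms to the schema. Returns false when
    // the validator rejects the document or the schema itself cannot be parsed.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, "<party><name>S</name></party>"));
        System.out.println(isValid(XSD, "<party><age>42</age></party>"));
    }
}
```

The second document uses an element the schema does not declare, so the validator rejects it. Swapping the StringReader sources for file streams gives back the snippet above.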

I think after approx 60 years of software development and now 15 years of Java (that was 45 years without), we see a very mature language, a great set of standardized technologies, a vast number of components and systems, a large community and supplier base, and a resilient run-time architecture.

I can't see that we could make a better choice.

(We are seeing new efforts into more efficient languages, but for now they are niche players. If a new language should replace Java, it would probably take 10-15 years of maturing before it could actually be there. Just think of all the work put into the Virtual Machine... And as I have argued; the real value is in how we actually construct software systems, not the language by itself)

Java is mature. The software-teens are over by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Saturday, March 5, 2011

Workshop with John Davies

We had John here for 2 days; it was intense and very encouraging. First and foremost, we got support for our efforts as to what stack of technology to use, what development process to follow and how to construct future applications. Secondly, we got tips on directions or priorities to take that just seem so obvious now that the session is over, but that we did not see before we started. Thirdly, there is feedback and experience that we do not yet know is relevant or achievable for us.

We started off with some hours of introduction to our domain. It is a steep hill and I am sure we could have used weeks to get more into it. But still I think we managed to get the rough picture so that we got input targeted for our challenges. Compared to what John previously has been working on, I think roughly that we have a lot less transactions, but the diversity in data captured and in regulations put on our systems seems harder.

So, to summarize briefly:

Get confidence and commitment from top management (by demoing prototypes).
Build prototypes, have a sandbox, prove concepts. Don't use valuable staff on reports.
Run agile, have much shorter projects.
Any plan over 2 years is a risk or a waste.
Understand your problem, use what you need, not what a salesperson thinks you need.
Implement software to run on more than one platform. Put this into the build process, be able to change database, app server or cache at any time.
Don't use too much time on storage architecture. Build for the cache, concentrate on the business logic.
Integration architecture is much simpler than what's in an ESB product; make protocol-neutral services, use a broker and DNS (JNDI).
Standardize and virtualize IT platform.
Use UML in design.
Ruby on Rails is great for GUI.
We have the chance to make a revolutionary tax-system.

Are you up to it? Can you help us implement this?

Tuesday, February 8, 2011

Comment on: RESTful SOA or Domain-Driven Design - A Compromise?

I find this talk by Vaughn Vernon very encouraging and it fits nicely into what we are trying to achieve.
It is very clear and educational and should be made basic training at our site.
http://www.infoq.com/presentations/RESTful-SOA-DDD

It actually fills a gap very nicely as to how our target architecture may be implemented, and more importantly, why it should behave as described. The Continual Aggregate Hub and restaurant is really a macro description of how such domain systems may cooperate in a pipeline, and how they can share a document storage (this is how we store Aggregates).

In the post on migration to the cloud I talk about 'Cooperating federated systems', and this is exactly what I observe Vernon is talking about in his presentation. I especially like the Domain Event (pages 19 and 34); it is the implementation of what I have called the 'Lifecycle'. Maybe lifecycle is not a good word, but it was chosen to illustrate what drives behavior on the domain objects, between and within these loosely coupled systems (they have separate bounded contexts).

The Domain Event Publisher (page 37) is the Process-state and handling of the CDH, and it is relaxed so that subscribing Modules may consume at the pace they wish. The Event logger is also an important aspect of it, so that re-running events is possible (illustrated nicely at page 39-).
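
The relaxed publisher and the event logger described above can be sketched roughly as an append-only log where each subscriber tracks its own read position. This is an illustration under assumptions, not Vernon's code and not our implementation: EventLog, Subscriber, DomainEvent and the event names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class EventLogSketch {

    // Hypothetical event shape: which aggregate changed, and what happened.
    static final class DomainEvent {
        final String aggregateId;
        final String type;

        DomainEvent(String aggregateId, String type) {
            this.aggregateId = aggregateId;
            this.type = type;
        }
    }

    // Append-only log. Events are never removed, so they can be re-run.
    static final class EventLog {
        private final List<DomainEvent> events = new ArrayList<>();

        synchronized void publish(DomainEvent event) {
            events.add(event);
        }

        // Copy of at most 'max' events starting at position 'offset'.
        synchronized List<DomainEvent> read(int offset, int max) {
            long wanted = (long) offset + Math.max(max, 0);
            int end = (int) Math.min((long) events.size(), wanted);
            int start = Math.min(offset, end);
            return new ArrayList<>(events.subList(start, end));
        }
    }

    // Each subscriber keeps its own position, so it consumes at its own pace.
    static final class Subscriber {
        private final EventLog log;
        private int offset = 0;
        final List<DomainEvent> handled = new ArrayList<>();

        Subscriber(EventLog log) {
            this.log = log;
        }

        // Handle at most 'max' new events; returns how many were handled.
        int poll(int max) {
            List<DomainEvent> batch = log.read(offset, max);
            handled.addAll(batch);
            offset += batch.size();
            return batch.size();
        }

        // Rewind to the start so that all logged events are handled again.
        void replay() {
            offset = 0;
            handled.clear();
        }
    }

    public static void main(String[] args) {
        EventLog log = new EventLog();
        log.publish(new DomainEvent("party-S", "Created"));
        log.publish(new DomainEvent("party-S", "IncomeReported"));
        log.publish(new DomainEvent("party-T", "Created"));

        Subscriber fast = new Subscriber(log);
        Subscriber slow = new Subscriber(log);
        fast.poll(10); // takes everything that is available
        slow.poll(1);  // takes one event and comes back later
        System.out.println(fast.handled.size() + " vs " + slow.handled.size());
    }
}
```

The publisher never waits for a subscriber, and a slow or restarted subscriber loses nothing because the log, not the subscriber, holds the events.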

We have very interesting performance results (over 50.000 aggregates per second) from a Proof of Concept of the Continual Aggregate Hub running in a grid.

(BTW: The real essence is not RESTful or not. Other service implementations and protocols will also do)

Saturday, February 5, 2011

Aggregate persistence for the Enterprise Application

Designing Aggregates (Domain Driven Design) is key to a successful application. This blog explains what the Aggregate and the Root Object is like in our Continual Aggregate Hub.

As I have written in earlier blogs, we are trying to handle our challenge - a large volume, highly flexible, pipeline sort of processing, where we seek to handle quite different sets of information, fix it and calculate some fee - by mixing Domain Driven Design, SOA, Tuple Space, BASE (and others) and a coarse grained document store that contains Aggregates (see previous discussions where we seek to store Aggregates as xml documents).

Known good areas of usage
We know that a document approach is applicable for certain types of challenges: Content Management Systems, Search Engines, Cookie Crunchers, Trading etc. We also know that documents handle transactions (messages) very nicely. But how applicable is it to an Enterprise Application type of system? We want loose coupling between sets of data because it lets us scale out, we want functional loose coupling, and there are other reasons discussed in earlier blogs here.

Why two data structures?
We want systems that are easier to develop and maintain. Today most Java systems have one structure on the business layer, where we successfully develop code and keep a good pace, using unit tests and mock data to enable fast development. Everything seems fine, until we reach the object-relational mapping (ORM). Here we must model all the same data again, but now in a different structure. At the storage level we add tables, constraints and indexes so that we are sure that the data is consistent. But that has already been done on the business layer. Why do we continue to do this twice?
The relational model is highly flexible, sound and robust. A good reason to use it is that we want to store data in a bigger context than the business logic handles. But wouldn't it be great to relax this layer and trust the business logic instead?

Relational vs. Document
We know that the document approach scales linearly very well, and that the relational database does not have the same properties because of ACID and other things, but why is it so?

Structure Comparison

The main reason for not being able to scale out is that data is spread out over many tables (and that is the main structure of most object databases too). Data for all contexts is spread over all tables. Data belonging to Party S is all over the place, mixed with Party T. During an insert (or update), concurrency challenges happen at tables A, B and C. The concurrency mechanism must handle continuous resource usage on all tables. No wonder referential integrity is important.
In the document model the objects A, B and C are stored within the document. This means that all data for Party S is in one document and T is in another. No common resource and no concurrency problem.
The document model is not as optimal if there are many usage scenarios that handle all objects C, regardless of what entity it belongs to.

The Enterprise Application challenge
So how do we solve the typical Enterprise Application challenge with a document store approach? (Shouldn't we be twice as agile and productive if we do not need to maintain a separate storage model?) Finding the granularity is important, and it should most probably follow the main usage scenarios. To be able to compose aggregates there should be some strong keys on which the business logic must ensure referential integrity. Even though we may not have integrity checks in the storage layer, I am not sure it is that bad. We do validate the documents (xsd and business logic) before we store them. And I have lost count of how much bad data I have debugged in databases, even though they have had a lot of schema enforcement.

The super-document
A lot of the information that we handle in our systems is not part of the domain. There is also intermediate information, historical states and audit, for instance. Remember that in a document approach we reverse the concept and store everything about Party S by itself, in its own document. To be able to cope with a document approach, the document itself must be placed within a structure (a super-document) that has more meta data about common concerns such as: keys, process information (just something simple like: new, under construction, accepted), rationale (what decisions did the system make in order to produce this result), anomalies (what errors are there in the aggregate), and audit (who did what, when). The <head> is the Root Object, and it is generic so that all documents are referenced in a uniform way. The super-document is structured like this:

<head>
<keys>
<process>
<aggregate>
<rationale>
<anomalies>
<audit>
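
To make the structure concrete, here is a rough Java sketch of the super-document as a generic wrapper around any aggregate. The field names follow the elements listed above; the types, the AuditEntry shape, the transition method and the rule in canBeAccepted are assumptions made for illustration, not the actual model.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SuperDocumentSketch {

    // The three simple process states named in the text.
    enum ProcessState { NEW, UNDER_CONSTRUCTION, ACCEPTED }

    // One audit line: who did what, when.
    static final class AuditEntry {
        final String who;
        final String what;
        final Instant when;

        AuditEntry(String who, String what, Instant when) {
            this.who = who;
            this.what = what;
            this.when = when;
        }
    }

    // Generic root object: the same wrapper regardless of aggregate type.
    static final class SuperDocument {
        final Map<String, String> keys = new LinkedHashMap<>();
        ProcessState process = ProcessState.NEW;
        final String aggregateXml; // the domain aggregate, stored as it is
        final List<String> rationale = new ArrayList<>();
        final List<String> anomalies = new ArrayList<>();
        final List<AuditEntry> audit = new ArrayList<>();

        SuperDocument(String aggregateXml) {
            this.aggregateXml = aggregateXml;
        }

        // Move to a new state and record the change in the audit trail.
        void transition(ProcessState next, String who) {
            audit.add(new AuditEntry(who, process + " -> " + next, Instant.now()));
            process = next;
        }

        // Hypothetical rule: a document with open anomalies is not accepted.
        boolean canBeAccepted() {
            return anomalies.isEmpty();
        }
    }

    public static void main(String[] args) {
        SuperDocument doc = new SuperDocument("<party><name>S</name></party>");
        doc.keys.put("partyId", "S");
        doc.transition(ProcessState.UNDER_CONSTRUCTION, "intake-module");
        doc.anomalies.add("missing income statement");
        System.out.println(doc.process + ", acceptable: " + doc.canBeAccepted());
    }
}
```

Because the wrapper is the same for every document type, generic code (routing on process state, listing anomalies, reading the audit trail) never needs to look inside the aggregate itself.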

The IO challenge
We succeed in this only if we also manage to make it perform. The main pitfall for any software project is the time and space dimensions. Your model may look great and your code super, but if it does not perform and does not scale, you lose. The document storage model is only successful if you manage to reduce IO, both calls and size. If you end up transporting too much information, or if you have too many calls (compared to ORM), then the document model may not be optimal. An Enterprise Application may have hundreds of tables, where probably 30% are m-m relations. I have seen applications with more than 4000 tables... Only a genius or demigod may manage that. Most probably it will just be unstable for the rest of its lifetime (see comment on the Silo). My structure example above is way too simple compared to the real world. But surely for many of these applications there is a granularity that fits the usage scenarios better. I have seen documents with 100.000 nodes getting serialized in less than a second. Do not 20 document types, small and large, seem like a more manageable situation than 200 tables?

In our upcoming proofs of concept we will be investigating these ideas.

Aggregate persistence for the Enterprise Application by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Tuesday, January 25, 2011

Implementing the CAH

So how do we do this? How do we implement the Continual Aggregate Hub (see also comment on restful soa or domain driven design)?

I recall how fast we produced business logic using Smalltalk and object oriented programming with good data structures and data operators. Since then I have not seen much of "Algorithms and data structures". What we all learned in basic IT courses at University does not seem to be in much use. Did SQL or DTOs ruin it all? Did the relational normalized data structure put on so many constraints that programming was reduced to plain looping, some calculations and moving data to the GUI? Where is the good business model that we can run code on? We need good software craftsmanship.

The basis for good implementation is a good architecture. We need to make some decisions on architecture (and their implementations) for our processing and storage. I would really like to have the same structure in the storage architecture as in the processing layer (no mapping!). It means less maintenance and a more verbose code base. There are so many interesting cases, patterns, and alternative implementations that we are not sure where to start. So maybe you could help us out?

Our strategy for our target environment is about parallel programming and being able to scale out. I find this talk interesting; at least slides 69/35 and the focus on basic algebra properties. http://www.infoq.com/presentations/Thinking-Parallel-Programming I find support here for what we are thinking of; the wriggle room, sequence does not matter. The waiters in the CAH restaurant are Associative and Commutative handling orders. I also agree that programmers should not think about parallel programming, they should think "one-at-a-time". But in designing the system parallel processing should be modeled in and it should be part of the architecture.
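
As a small illustration of why these algebraic properties matter, assume the orders are plain amounts and the waiters' job is to add them up. This is a hypothetical sketch (the class name, the total method and the amounts are made up) using the standard Java parallel stream reduce, which relies on the combining function being associative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class OrderIndependentSumSketch {

    // The programmer writes the "one-at-a-time" rule (add two amounts).
    // Because addition is associative, the runtime may split the list, sum
    // the parts on different threads and combine the partial sums. Because
    // it is also commutative, the arrival order of the orders is irrelevant.
    static long total(List<Long> amounts) {
        return amounts.parallelStream().reduce(0L, Long::sum);
    }

    public static void main(String[] args) {
        List<Long> orders = Arrays.asList(120L, 80L, 45L, 300L);

        // Same orders, arriving in a different sequence.
        List<Long> shuffled = new ArrayList<>(orders);
        Collections.shuffle(shuffled, new Random(42));

        System.out.println(total(orders) == total(shuffled));
    }
}
```

An operation such as subtraction has neither property, so the same code would give results that depend on how the work happened to be split. That is the wriggle room: it exists only when the operation itself allows it.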

It seems like Tuple Space is the right direction, but here too there are different implementations. What other implementations are there that will be sound and solid enough for us? Several implementations are referenced at (http://blogs.sun.com/arango/entry/coordination_in_parallel_event_based), but which?

For the storage architecture there are also many alternatives. Hadoop with HBase, or MarkLogic, for instance. Or is Hadoop much more? If we can have all storage and processing at every node, how do we manage it? How much logic can be put into the Map-Reduce? What is practical to process before you merge the result?
I just can't let go of the feeling that structured storage belongs in a well-known and solid relational database. The real challenge is to think totally differently about how we should handle and process our data (see document store for enterprise applications). Is it really necessary to have a different data structure in the storage architecture? Maybe I am feeling like waking up from a bad dream.

In the CAH blog we want to store the Aggregates as they are. I think we do not need a different data structure in the processing architecture (layer) and the storage architecture.

(2013.10.30): It is now implemented: big data advisory definite content

Implementing the CAH by Tormod Varhaugvik is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Thursday, January 13, 2011

Migration strategy and the "cloud"

Our background
We have a classic IT-landscape with different systems built over the last 30 years, large and small "silos" - mostly serving their own business unit - and file based integration between these. Mainly we have Cobol/DB2 and Oracle/PL-SQL/Forms.

We are challenged by a quite classic situation. How do we serve the public, how do we get a holistic view on the entities we are handling, how do we handle events in more "real time", how do we make maintenance less expensive, how do we get much more responsive to change, how do we ensure that our systems and architecture are maintainable in 2040? Every business which is not greenfield, is in this situation.

We are working with Enterprise Architecture and using TOGAF. Within this we have defined a target architecture (I have written about core elements of this in Concept for a aggregate store and processing architecture), and are about to describe a road map for the next 5-10 years.

Content:

What's this "silo"?
What do we expect of the cloud?
Loose coupling wanted
Cooperating federated systems
Complexity vs uptime
Target architecture main processes

What's this "Silo"?

Don't feed the silo! The main classifier for the silo is that it is too big to manage and that it is not very cooperative. It is very self-centered and wants to keep all data and functionality to itself. By not feeding it, I mean resisting the temptation to put just one more piece of functionality onto it because so much is already there. But because so much is intermingled, the consequences of change are tough to foresee. What you have is a system that no one really understands: it takes ages to add new functionality, you probably never get it stable, and testing is very costly (compared to the change). In many ways the silo is this guy from Monty Python exploding from a "last chocolate mint".
Typically these silos each have a subset of information and functionality that affects the persons and companies that we handle, but getting the overall view is really hard. Putting a classic SOA strategy on top of this is a catastrophe.
Size is not the main classifier though. Some problems are hard and large. We will have large systems just because we have large amounts of state and complex functionality.

What do we expect of the cloud?

Cloud container

First and foremost it is an architectural paradigm describing a platform for massively parallel processing. We need to start building functionality within a container that gives us freedom in deployment and separates business functionality from technical concerns or constraints. We need an elastic computing platform, and we need to start constructing applications that scale "out of the box" (horizontal scaling). We will focus on IaaS and PaaS.
But not all problems or systems gain from running in the cloud. Most systems built to this day simply do not run in the cloud. Also, data should not "leave our walls", but we can always set up our own cloud or take a more national government approach.

Divide and conquer

Modules and independent state

No one understands the silo itself, and it is not any easier to understand when the total processing consists of more than one silo. The problem must be decomposed into modules, and these modules must have loose coupling and separate concerns. We find Domain Driven Design to be helpful. But the challenge is more than just a functional one: there are also technical and organizational requirements that put constraints on which modules are actually needed. Remember that the goal is to have a systems environment which is cheaper to maintain and easier to change as requirements change. The classical SOA approach overlooks the importance of state. No process can function without it. So putting a SOA strategy (implementing a new integration layer such as a Web Service / BPEL-like system) on top of silos that already had a difficult maintenance situation in no way makes things simpler. The total problem must be understood. Divide and conquer! Gartner calls this application overhaul.
The organization that maintains and runs a large system must understand how their system is being used. They must understand the services others depend upon and what SLAs are put on them. Watch out for a strategy where some project "integrates" with these systems without involving the system's organization or the system itself. Release handling and stability will not gain from this "make minor" approach. The service is just the tip of the iceberg.
A silo is often the result of a unilateral focus on organization. The business unit deploys a solution for its own business processes, overlooking reuse and the greater business process itself. Dividing such a silo is largely about governance.
Also, you will see that different modules have different technical requirements. Therefore there may be different IT architectures for the different systems.
When a module is master in its domain, you must understand the algorithms and the data. If they have independent behavior (can be sharded), the processing can be parallelized and run in the cloud. In the example, the blue module has an independent information element. It will probably gain from running in the cloud, but must still cooperate with the yellow and green modules.
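
To show what "independent behavior (can be sharded)" buys us, here is a small Python sketch. The record layout, the shard count and all function names are assumptions for illustration. Records are routed to a shard by key, each shard is processed without looking at any other shard, and the results are simply put together afterwards.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def shard_of(key, shard_count):
    """Stable routing: the same key always lands on the same shard."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def partition(records, shard_count):
    shards = [[] for _ in range(shard_count)]
    for key, value in records:
        shards[shard_of(key, shard_count)].append((key, value))
    return shards


def process_shard(shard):
    """Works on one shard only; needs no state from any other shard."""
    totals = {}
    for key, value in shard:
        totals[key] = totals.get(key, 0) + value
    return totals


def process_all(records, shard_count=4):
    shards = partition(records, shard_count)
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        results = list(pool.map(process_shard, shards))
    combined = {}
    for result in results:
        combined.update(result)  # safe: a key never appears in two shards
    return combined
```

Threads stand in for cloud nodes here. If process_shard had to read data owned by another shard, the module would not be independent and this simple split would no longer hold.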
