14 Best Practices to Develop the Best Data Integration Platform
Integrating two or more applications so that one can send data to and receive data from another is one of the most important considerations for any organization. With varied data-specific logic, it is often cumbersome to define a proper algorithm for integrating your business processes. To cope, you may have kept a sizeable technical team on your premises to deal with common pitfalls in the day-to-day operation of the integration. In this post, I am going to cover some of the problems you might run into while developing an integration solution, and ways to handle them efficiently.
Here are some of the points you should look at while developing an integration solution:
- Separation of Concerns: Even though it is quite easy to develop a solution on top of an existing platform, it is not recommended. Say you have a back-end ERP system that needs to send real-time updates to your eCommerce applications, so that when you change prices in the ERP system or put a discount on products, the prices in the eCommerce systems are updated. In such a scenario, you could do the data transformation inside the on-premise ERP system and have the updates trigger a process that updates the application on the other side. That means you are putting all the logic needed to integrate the application inside the application itself.
Now say there is a new upgrade to the ERP and you need to change your customization in such a way that the integration with the other application(s) keeps working. This would be a daunting task. As your business grows and you add more and more sites, the complexity of handling everything grows rapidly.
Thus the integration should live in a separate project so that all of its complexity is handled in one place. As the snapshot above shows, connecting all your applications to a common platform that covers every integration need helps immensely.
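As a rough illustration of keeping the transformation outside the ERP, here is a minimal Python sketch. The field names (`item_code`, `list_price`, `discount_pct`) and the shape of the shop payload are assumptions made up for this example, not any real ERP or eCommerce API.

```python
def erp_price_to_shop_update(erp_event):
    """Translate a hypothetical ERP price event into a shop-side payload.

    This mapping lives in the integration layer, so an ERP upgrade only
    requires changing this one function.
    """
    discount = erp_event.get("discount_pct", 0)
    price = erp_event["list_price"] * (1 - discount / 100)
    return {"sku": erp_event["item_code"], "price": round(price, 2)}


def propagate(erp_event, shops):
    """Push one translated update to every connected shop.

    `shops` is a list of callables, one per eCommerce site (assumed).
    """
    update = erp_price_to_shop_update(erp_event)
    for push in shops:
        push(update)
    return update
```

Adding a new site then means adding one more callable to `shops`; neither the ERP nor the existing sites need to change.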
- Scalability of the integration is paramount
Cloud is one of the most active technology areas today, and the ability to scale an application on demand is one of the major requirements of an integration platform. As a customer, you never know when the volume of data created in your applications will rise and when it will fall. Building scalability into your integration platform lets you deal with performance bottlenecks easily and efficiently. Moreover, an integration platform runs best when all of the communicating servers are placed on the same network (the same rack or the same local network), so scalability that works both on premise and in the cloud gives you an extra edge. Some vendors do not want their data to go to the cloud; in such cases, scalability support for an on-premise solution suits their needs best. For instance, if you keep your Source and Target at least on the same local network (if not on the same machine), your integration solution will perform at its best.
- Process Chaining and Orchestration Engine:
Process chaining and dependency identification are important concerns nowadays. Business processes are rarely independent, and to integrate a business process you need to chain its dependencies so that the whole workflow executes as a single unit.
Beyond process chaining, it is also important to identify the dependency type so that the process can improve performance and reuse data. Some of the chaining types that could be identified are:
- Independent tasks – Pre
- Independent tasks – Post
- Independent tasks – In process
- Shared Input
- Shared output
A chaining mechanism that lets people define the dependency type makes a stronger case for itself as an integration platform.
Ideally, there is also an orchestration engine that determines the type of chaining automatically and, based on the data coming from the actual workflow task, invokes the tasks that depend on each other. When automated orchestration is established, your implementation can be considered fully optimized.
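To make the chaining idea concrete, here is a minimal Python sketch that orders tasks by their declared dependencies and runs them as one unit. The task names are hypothetical, and the standard library's `graphlib.TopologicalSorter` (Python 3.9+) stands in for a real orchestration engine.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "fetch_orders": set(),
    "fetch_customers": set(),
    "transform_orders": {"fetch_orders", "fetch_customers"},
    "push_to_shop": {"transform_orders"},
    "send_summary": {"push_to_shop"},
}


def execution_order(deps):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def run_chain(deps, handlers):
    """Run each task once, in dependency order.

    Every handler receives the results collected so far, so a later task
    can reuse the output of an earlier one instead of fetching it again.
    """
    results = {}
    for task in execution_order(deps):
        results[task] = handlers[task](results)
    return results
```

Passing the shared `results` dictionary to each handler is one simple way to model the "shared input" and "shared output" chaining types from the list above.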
- Service Oriented Architecture
SOA is a common and widely accepted technique for data delivery. While defining your own architecture, it is important to follow standard SOA techniques so that data transferred from one application to another follows a universal pattern and any third party can plug in to it. With SOA implemented on top of your integration platform, most of the existing applications you commonly use can integrate easily, which helps your solution deliver on its goals.
The above image shows what a standard SOA looks like: the consumer requests some data and the server delivers it on request, without maintaining state on the server.
The integration platform can expose certain statistics to the outside world, and with SOA built into the platform it can easily integrate with any other external application and supply business intelligence. An open-ended architecture also reduces platform dependency.
- Concept of Enterprise Service Bus (ESB) or Enterprise Application Integration (EAI)
Enterprise Service Bus and Enterprise Application Integration have become common buzzwords recently. If you are thinking of open-ended SOA integration with high scalability, the enterprise service bus is one of the architectural patterns you should follow.
An enterprise service bus is a concept for mutual communication between two or more parties, with some common duties in mind:
- Self-monitored control of messages exchanged between communicating parties.
- Distributing information across intended parties quickly and easily.
- Use of common protocols.
- Retain messages when intended parties are offline or not available for message consumption.
- Changing, re-routing, and logging of information can be done at any time without changing the implementation of the service bus.
- Providing incremental solutions to the problem.
A service bus that ensures message delivery is a great addition for improving the security and trustworthiness of the applications being integrated.
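The "retain messages when parties are offline" duty can be sketched in a few lines of Python. This is an in-memory toy, assuming hypothetical subscriber names and callbacks; a real service bus would persist the queues and use a standard messaging protocol.

```python
from collections import deque


class MessageBus:
    """Toy bus that holds messages for subscribers that are offline."""

    def __init__(self):
        self._pending = {}  # subscriber name -> queue of retained messages
        self._online = {}   # subscriber name -> delivery callback

    def register(self, name):
        """Declare a subscriber so the bus starts retaining messages for it."""
        self._pending.setdefault(name, deque())

    def connect(self, name, callback):
        """Bring a subscriber online and flush everything it missed, in order."""
        self.register(name)
        self._online[name] = callback
        queue = self._pending[name]
        while queue:
            callback(queue.popleft())

    def disconnect(self, name):
        self._online.pop(name, None)

    def publish(self, message):
        """Deliver to online subscribers, retain for the rest."""
        for name, queue in self._pending.items():
            callback = self._online.get(name)
            if callback is not None:
                callback(message)
            else:
                queue.append(message)
```

Because publishers only talk to the bus, routing or logging can later be added inside `publish` without touching any of the communicating applications.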
- Concept of microservices and their use in an integration platform
In a monolithic architecture, all the platforms in a connected system depend on each other in terms of data, processing, error handling, user interfaces, and so on. Hence, when a problem occurs in one system, it automatically propagates to the others and affects the dependent processes. Microservices are small unit processes that are independent of one another and can execute self-sufficiently, yet can communicate using language-agnostic APIs. The decoupled architecture helps in scaling processes individually, so you can ensure that the most critical business process has your best resources available.
With microservices, you ensure that a critical business process affected by an integration remains unhampered by other dependencies, and you can also optimize each integration endpoint individually based on your own needs.
Microservices also allow continuous delivery: a small change to one part requires only a small number of services to be rebuilt and has no impact on independent services. Hence, implementing microservices is key for an integration platform.
- Notification hubs and/or Standard notification framework
Notifying users of integration issues is one of the key criteria of a good integration platform. A good integration platform should report even the most minute details of the activities it performs during execution, and anyone subscribed at that level should be notified accordingly. Most software companies build notification hubs, which help in generalizing notification processes. This is one of the most crucial techniques to consider while creating your integration platform.
Ideally, there should be one generalized notification component that interacts with the standard modules over a standard protocol and also has an open-ended API for accessing the notification hubs from the external world. The notification hubs should also have subscribers, and messages should be sent and received based on the subscription. Supporting a large number of devices and a wide range of notification media (for instance, email, SMS, push notifications, etc.) is key.
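A generalized notification component of this kind might look roughly like the following Python sketch. The topic names, channel names, and the `(recipient, text)` signature of a channel are assumptions for illustration only.

```python
class NotificationHub:
    """Toy hub: subscribers pick a topic and a delivery channel."""

    def __init__(self):
        self._channels = {}       # channel name -> send(recipient, text)
        self._subscriptions = []  # (topic, channel name, recipient)

    def add_channel(self, name, send):
        """Register a medium such as email, SMS, or push."""
        self._channels[name] = send

    def subscribe(self, topic, channel, recipient):
        if channel not in self._channels:
            raise KeyError(f"unknown channel {channel!r}")
        self._subscriptions.append((topic, channel, recipient))

    def notify(self, topic, text):
        """Send `text` to everyone subscribed to `topic`; return the count."""
        delivered = 0
        for sub_topic, channel, recipient in self._subscriptions:
            if sub_topic == topic:
                self._channels[channel](recipient, text)
                delivered += 1
        return delivered
```

New media can be added with `add_channel` without touching the modules that raise notifications, which is the point of having one generalized component.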
- Real-time synchronization
Sometimes the time taken to sync data between applications is critical. In such scenarios, real-time synchronization of data is an important consideration. Real-time synchronization is push based rather than schedule based. Most applications support webhooks or customizations. Either way, you can configure the application to notify its subscribers so that when data is entered into the application, it is pushed to the integration platform for synchronization.
Many integration platforms embed a web server in the application itself. Exposing a public IP, local IP, or DNS name for that machine on a specific port ensures that your application, even if it is external, can receive data notifications directly whenever data is entered into the source application.
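A push-based receiver can be sketched with Python's standard `http.server` module. The payload shape (a JSON object with an `entity` field), the port, and the function names are assumptions for this example; a production endpoint would also need authentication and TLS.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from queue import Queue

# Hand-off point to the rest of the (hypothetical) integration platform.
incoming = Queue()


def accept_payload(raw):
    """Validate a webhook body and queue it for synchronization."""
    try:
        event = json.loads(raw)
    except ValueError:  # malformed JSON or undecodable bytes
        return False
    if not isinstance(event, dict) or "entity" not in event:
        return False
    incoming.put(event)
    return True


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        ok = accept_payload(self.rfile.read(length))
        # 202: accepted for later processing; 400: payload rejected.
        self.send_response(202 if ok else 400)
        self.end_headers()


def serve(port=8080):
    """Listen on all interfaces; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. The handler only validates and queues the event, so the source application gets a quick response while the actual synchronization happens elsewhere.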
- Process Scheduling to activate synchronization
A scheduler is another important feature to consider while developing the architecture. The more capable the scheduler you build, the more scenarios your connector can cover. In a complex business process, individual processes cannot all fetch data optimally at the same time, so optimizing your scheduler is one of the most critical implementation tasks for any integration platform.
Some of the scheduling support that an integration platform should provide:
- Recurrence schedule at hourly and minute-level intervals.
- Daily schedule at a specific time.
- Looped schedule.
- Weekly schedule at a specific time of day.
Ideally, if an integration platform follows the standardized iCalendar rules, the schedules can be integrated anywhere.
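The schedule types listed above boil down to computing the next run time. Here is a minimal Python sketch using only `datetime`; the function names are made up for this example, and it does not parse iCalendar rules, which a real scheduler would more likely delegate to a dedicated library.

```python
from datetime import datetime, time, timedelta


def next_interval_run(last_run, minutes):
    """Recurring schedule: run every `minutes` minutes (60 for hourly)."""
    return last_run + timedelta(minutes=minutes)


def next_daily_run(now, at):
    """Daily schedule: next occurrence of the time `at` strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    return candidate if candidate > now else candidate + timedelta(days=1)


def next_weekly_run(now, weekday, at):
    """Weekly schedule: `weekday` uses Monday=0 ... Sunday=6."""
    days_ahead = (weekday - now.weekday()) % 7
    candidate = datetime.combine(now.date() + timedelta(days=days_ahead), at)
    return candidate if candidate > now else candidate + timedelta(days=7)
```

For example, with `now` on a Monday at 10:00, `next_weekly_run(now, 0, time(9, 0))` moves to the following Monday because 09:00 has already passed.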
- Failure detection and fault tolerance measures
A good integration platform should focus on how transient failures in the integration are handled. For a long-running process, you are likely to experience failures due to the unavailability of dependent applications and/or of the integration platform itself. Smarter handling of failed data is required so that the business is not hampered by problems in the connector itself. Let us take an example: say you have built a connector for stock updates that keeps track of stock and inventory when an order is placed, and your orders come from, say, three marketplaces and one eCommerce site. Now suppose your connector is unavailable for 10 minutes. In such a scenario, you can have three kinds of problems:
- Orders are placed for more than the actual inventory you have.
- Disparity between actual stock and inventory across all the applications.
- You do not know which orders you cannot ship.
An ideal connector should have a facility for detecting all the failed data that could not be synced, together with rule-based adjustments which ensure that, once connectivity is re-established, you can dispatch an action to resync all the failed entries, update every other connected application, and notify the application to trigger emails for the orders you cannot ship.
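The detect-and-resync behaviour could be sketched in Python as follows. The `send` callable and its use of `ConnectionError` to signal an unreachable target are assumptions for this example; a real connector would also persist the failed records so they survive a restart.

```python
class SyncTracker:
    """Remember records that could not be synced and retry them later."""

    def __init__(self, send):
        # `send` pushes one record to the target application and is
        # assumed to raise ConnectionError when the target is unreachable.
        self._send = send
        self.failed = []

    def sync(self, record):
        """Try to sync one record; keep it for later if the target is down."""
        try:
            self._send(record)
            return True
        except ConnectionError:
            self.failed.append(record)
            return False

    def resync(self):
        """Retry every failed record; return how many went through this time."""
        pending, self.failed = self.failed, []
        for record in pending:
            self.sync(record)  # re-queues itself on another failure
        return len(pending) - len(self.failed)
```

Anything still left in `failed` after a resync is a natural input for the notification step, for instance to email about orders that cannot be shipped.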
- Rule based logic evaluation and action dispatcher
Sometimes, getting business value out of an integration is also important. Although the main benefit of the platform is to ensure that data is transferred smoothly from one application to another, a decision-making engine wrapped around your data, keeping an eye on what is being transferred, can be very handy. For the problem described in point 10, you can add an action dispatcher that automatically triggers a process to act on the application holding the failed data and to take decisions automatically.
On the other hand, if you want to keep track of a critical B2B order that you are expecting, and you want to be notified about it immediately, a rule-based action dispatcher lets you define such rules.
A rule engine runs on either the output or the input data, depending on the type of rule, and takes action according to what is set up on the server. Say yes to rule engines.
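A rule-based action dispatcher can be very small at its core. The Python sketch below pairs a condition with an action; the record fields (`channel`, `priority`) and the rule name are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # decides whether the rule applies
    action: Callable[[dict], None]     # what to do when it does


def dispatch(rules, record):
    """Evaluate every rule against one record and run matching actions.

    Returns the names of the rules that fired, which is handy for logging.
    """
    fired = []
    for rule in rules:
        if rule.condition(record):
            rule.action(record)
            fired.append(rule.name)
    return fired
```

The critical B2B order case then becomes one `Rule` whose condition checks the order's channel and priority and whose action sends the notification.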
- Parallel computing and multiple agents
Parallel computing is another good bet for your integration platform. For any software integrator, the challenge is to execute and synchronize data between two applications as quickly as possible. Increased transaction volume and data transformation load sometimes make it hard to scale up vertically on the same box. While building your integration platform, one of the important things to consider is how to support parallel execution so that you can scale your integration horizontally. You may also be getting more hits on certain kinds of data than on others. So while updating stock, which might take a huge amount of resources, you want another process to run in parallel.
To scale an implementation horizontally, we can group configurations so that one agent performs one group of tasks and another agent performs another group. With horizontal scaling you can also ensure that the agents run in parallel, each with separate hardware resources specific to its own sandbox.
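The grouping of tasks per agent can be illustrated with Python's `concurrent.futures`. In this sketch, threads merely stand in for agents; real agents with their own hardware would be separate processes or machines, and the agent and task names are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(name, tasks):
    """One agent works through its own group of tasks sequentially."""
    return name, [task() for task in tasks]


def run_agents(groups):
    """Run every agent in parallel; `groups` maps agent name -> task list."""
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        futures = [
            pool.submit(run_agent, name, tasks)
            for name, tasks in groups.items()
        ]
        # result() re-raises any exception that occurred inside an agent.
        return dict(future.result() for future in futures)
```

With this split, a heavy stock update assigned to one agent no longer blocks the order sync handled by another.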
- Open ended integration platform
Another point to consider while creating an integration platform is making it open-ended. When your integration platform is open-ended, more and more applications can easily be plugged in and work together. Making a platform open-ended is a challenge. Generally, every component you develop should individually support pluggable interfaces. There should be enough hooks for optimization that anyone implementing your platform for a client can easily customize, adjust, or optimize the implementation, even by writing code if required.
For the web, you can provide REST or other open-protocol APIs with standard authentication so that platforms built on standard protocols can talk to your servers and customize their behavior.
For agents, there should be a standard set of APIs for standard languages, along with plugin rules, so that anyone can easily understand the API and plug in to the APIs exposed on the agents.
For messaging, standard protocols should also be available so that anyone holding a secret key and authorization can easily pass messages to your account.
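One common way to offer pluggable interfaces is a small registry that third-party code can add to. The Python sketch below is only an illustration of that idea; the decorator name, the registry, and the `CsvConnector` class are all hypothetical.

```python
# Hypothetical registry: connector name -> class that builds the connector.
_connectors = {}


def connector(name):
    """Class decorator that registers a connector under `name`."""
    def register(cls):
        _connectors[name] = cls
        return cls
    return register


def create_connector(name, config):
    """Build a registered connector from its configuration."""
    try:
        factory = _connectors[name]
    except KeyError:
        raise ValueError(f"no connector registered as {name!r}") from None
    return factory(config)


# An implementer plugs in a new connector without touching platform code.
@connector("csv")
class CsvConnector:
    def __init__(self, config):
        self.path = config["path"]
```

The platform only ever calls `create_connector`, so adding support for a new application is a matter of shipping one more decorated class.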
- Generic helpers and availability of SDK
When you are building an integration platform, you are building software that other highly skilled software implementers will work on. Making their lives easier is an important consideration.
As a platform for developers, an integration platform should provide a proper help and support section for its SDK, along with code snippets and reusable components that developers can tweak for their own purposes. Cloud-based configuration definitions and an easy SDK for ISVs building solutions will give you an edge in the competitive market for integration solutions.
Architecture design is key to any software platform. A better architecture at inception helps avoid long-running issues. Advanced features such as a hybrid integration model, failure detection, rule-based action execution, advanced scheduling, process chaining, and workflow management, as mentioned above, can add considerable value to an integration platform. We hope this post is helpful; do let us know your thoughts on the topic.