John Schuster, vice president of engineering at SnapLogic, explains to IT Business Edge's Loraine Lawson how he sees both cloud and Big Data putting new demands on ETL and EAI integration approaches.
“We have the ability to deploy SnapLogic in the cloud, but connect to data sources that are on-premise behind a customer firewall. This is really important because a lot of customers want to build pipelines that connect sources and destinations that are in different security zones ... ”
- John Schuster, VP of Engineering, SnapLogic
Lawson: What's the basis for your integration work? Is it ETL?
Schuster: Actually, we find ourselves answering questions about what we are quite a bit, because in cloud integration the line between ETL and EAI is not as clear. Typically in cloud integration you need a little of both to get your job done.
We really see ourselves as enabling our customers to integrate their cloud applications. That requires some amount of ETL-like data movement and transformations, but certainly also requires EAI-like functionality to interface with the different applications that might be data sources or data destinations.
Beyond that, we also see that Big Data and cloud are in fact changing the landscape quite a bit. In some ways, you can think about it as Big Data pushing on ETL and cloud pushing on EAI, creating a gap in the market. We're trying to fill that gap by providing the right set of features and functionality to allow customers to get their job done.
Lawson: Why do you say cloud is “pushing” EAI?
Schuster: It’s making demands on EAI products. The number and diversity of applications and data formats is drastically increasing over time. If you think about products that were designed 10 years ago or longer, in the '90s, they're really designed for a smaller number of applications, typically running on a LAN or a fast, reliable network. When the number increases and the networks become less reliable and the cloud service providers become much more reliable, it actually changes the product requirements on EAI.
Anytime you have an order of magnitude increase in the number of things that you're processing or the amount of data that you're processing, it really changes how you architect products. For example, if you're designing a system to be reliable, you might make the assumption that the network has low latency and is fast. But if you take the product that you built for that, put it on the Internet and tell it to transfer data between Salesforce and SAP behind a firewall, it might not work out the way you hoped. You might need a product that understands how to recover from failures, how to deal with high latency, how to retry in the event of failure. That would be an example of how I see cloud changing the EAI landscape.
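To make the "retry in the event of failure" point concrete, here is a minimal Python sketch of retrying a flaky network call with exponential backoff and jitter. It is only an illustration, not how SnapLogic or any particular EAI product works. The names `TransientError` and `call_with_retry`, and the default delays, are hypothetical and chosen for this example.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def call_with_retry(operation, max_attempts=5, base_delay=0.5,
                    max_delay=30.0, sleep=time.sleep):
    """Run operation(), retrying on TransientError with exponential backoff.

    The wait ceiling doubles after each failed attempt (capped at max_delay),
    and the actual wait is a random value below that ceiling so that many
    clients do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

A caller would wrap each remote step, for example `call_with_retry(lambda: fetch_page(url))`, where `fetch_page` is a placeholder for whatever function talks to the remote application and raises `TransientError` on a timeout.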
Lawson: How is Big Data changing ETL?
Schuster: Again, when you designed a system to move data from point A to point B and transform it, you might have made assumptions like, for example, that you can actually process all that data on a single server.
In fact, with Big Data, that’s no longer the case. When you need to process, say, a terabyte or 10 terabytes of data instead of a sizable chunk like maybe 100 GB, you need a different product. You need a Big Data processing engine, such as a Hadoop cluster, and you need a product that knows how to interface with that. Again, when things tend to grow by orders of magnitude, you really need to design differently.
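The shift away from the single-server assumption can be sketched as a split, process, combine pattern. The Python below is a toy illustration only. It uses a local thread pool as a stand-in for cluster nodes and is not Hadoop or its API. The function names and the `(source, payload)` record shape are assumptions made for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_parts):
    """Split records into num_parts roughly equal slices."""
    return [records[i::num_parts] for i in range(num_parts)]


def map_partition(records):
    """Per-partition work: count records by source system."""
    return Counter(source for source, _payload in records)


def reduce_counts(partials):
    """Combine the per-partition counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_by_source(records, num_workers=4):
    """Count records per source by processing partitions independently."""
    parts = partition(records, num_workers)
    # Each partition is handled on its own; on a real cluster each one
    # would run on a different machine, close to where the data is stored.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(map_partition, parts))
    return reduce_counts(partials)
```

Because no partition depends on another, the same logic keeps working as data grows past what one machine can hold, and that property is what a distributed engine exploits.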
You can find this interview at: http://www.itbusinessedge.com/cm/community/features/interviews/blog/snaplogic-uses-rest-based-approach-to-cloud-and-on-premise-integration/?cs=50072&utm_source=itbe&utm_medium=email&utm_campaign=EEB&nr=EEB
Gervas
