<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Hadapt</title>
	<atom:link href="http://hadapt.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://hadapt.com</link>
	<description>The greatest thing to happen to Hadoop since Hadoop</description>
	<lastBuildDate>Mon, 17 Jun 2013 23:29:38 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Classifying Today&#8217;s &#8220;Big Data Innovators&#8221;</title>
		<link>http://hadapt.com/classifying-todays-big-data-innovators/</link>
		<comments>http://hadapt.com/classifying-todays-big-data-innovators/#comments</comments>
		<pubDate>Fri, 21 Dec 2012 14:57:44 +0000</pubDate>
		<dc:creator>Daniel Abadi</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hadapt.com/?p=1642</guid>
		<description><![CDATA[<p>Last week, InformationWeek published a piece, authored by Doug Henschen, that listed 13 innovative Big Data vendors. The complete list is reproduced below: 1.  MongoDB 2.  Amazon (Redshift, EMR, DynamoDB) 3.  Cloudera (CDH, Impala) 4.  Couchbase 5.  Datameer 6.  Datastax &#8230; <a href="http://hadapt.com/classifying-todays-big-data-innovators/">Continued</a></p><p>The post <a href="http://hadapt.com/classifying-todays-big-data-innovators/">Classifying Today&#8217;s &#8220;Big Data Innovators&#8221;</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p>]]></description>
			<content:encoded><![CDATA[<p style="text-align: left;" align="center">Last week, InformationWeek <a href="http://www.informationweek.com/big-data/slideshows/software/information-management/13-big-data-vendors-to-watch-in-2013/240144124">published a piece</a>, authored by Doug Henschen, that listed 13 innovative Big Data vendors. The complete list is reproduced below:</p>
<p>1.  MongoDB<br />
2.  Amazon (Redshift, EMR, DynamoDB)<br />
3.  Cloudera (CDH, Impala)<br />
4.  Couchbase<br />
5.  Datameer<br />
6.  Datastax<br />
7.  Hadapt<br />
8.  Hortonworks<br />
9.  Karmasphere<br />
10.  MapR<br />
11.  Neo Technology<br />
12.  Platfora<br />
13.  Splunk</p>
<p>These 13 vendors distribute 16 unique data management products (since both Amazon and Cloudera offer multiple distinct data management/processing systems), all of which push the boundary on Big Data management.</p>
<p>In this post I will attempt to subcategorize these 16 products into a competitive grouping, where products placed inside the same group can be considered replacements for each other (and hence are competitive), and each group is complementary to every other group.</p>
<p>Before starting this classification, I will remove three products that, while potentially interesting from a Big Data perspective, are often used outside of what has become known as the “Big Data realm”, and whose primary competitors therefore did not make it onto the InformationWeek list. These three products are Splunk (which typically competes with companies focused on the security, compliance, and IT operations management verticals), Amazon Redshift (which typically competes with traditional MPP database vendors), and Neo Technology (which is usually classified as a “NoSQL database”, but whose focus on graph data makes it highly unique from a technology and use case perspective relative to the other NoSQL databases on this list).</p>
<p>The remaining 13 products can be classified into four distinct groups:<br />
1.  Operational data stores that allow flexible schemas<br />
2.  Hadoop distributions<br />
3.  Real-time Hadoop-based analytical platforms<br />
4.  Hadoop-based BI solutions</p>
<p><strong><span style="text-decoration: underline;">Group 1 (operational data stores that allow flexible schemas)<br />
</span></strong>This group is composed of database products that can be used to manage active data for dynamic applications with hard to define (or hard to predict) schemas. The database must be optimized for inserting, retrieving, updating, or deleting individual data items in real-time (latencies on the order of milliseconds), but should also support some sort of interface for performing analysis of the data stored within. The dynamic nature of the typical use case for databases in this group implies a NoSQL interface, and either a key-value or document-store retrieval model. From the InformationWeek list, MongoDB, DynamoDB, Couchbase, and Datastax all fit in this category. Although there are some significant technical differences between these products, they can nonetheless be roughly described as potential replacements for each other in Group 1 use cases.</p>
<p><strong><span style="text-decoration: underline;">Group 2 (Hadoop distributions)<br />
</span></strong>The products in this group are designed for very different situations than Group 1. Hadoop is typically used for large scale data analysis and batch processing. Rather than inserting, retrieving, updating, or deleting individual data items, Hadoop is optimized for scanning through large swaths of data, processing and analyzing the data as it proceeds. Hadoop has become the poster-child for “Big Data” due to its proven massive scalability, and its ability to handle the “variety” aspect of Big Data (since Hadoop does not require data to fit neatly into rows and columns in order to be analyzed and processed). From the InformationWeek list, Cloudera, Hortonworks, MapR, and Amazon EMR all fit in this category.</p>
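<p>The difference between the Group 1 and Group 2 access patterns can be sketched in a few lines of Python. This is only an illustration: the in-memory <code>store</code> dictionary and the helper names (<code>put</code>, <code>get</code>, <code>delete</code>, <code>count_by_country</code>) are hypothetical stand-ins and do not correspond to any vendor's API.</p>

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for an operational store (Group 1 style).
store = {}

def put(key, doc):
    """Insert or update a single item by key."""
    store[key] = doc

def get(key):
    """Retrieve a single item by key (None if absent)."""
    return store.get(key)

def delete(key):
    """Delete a single item by key, ignoring missing keys."""
    store.pop(key, None)

def count_by_country(records):
    """Group 2 style: scan every record once and aggregate as we go."""
    counts = defaultdict(int)
    for rec in records:
        counts[rec.get("country", "unknown")] += 1
    return dict(counts)

# Documents need not share a schema.
put("u1", {"name": "Ada", "country": "UK"})
put("u2", {"name": "Lin", "country": "CN", "tags": ["beta"]})
put("u3", {"name": "Sam"})

one = get("u2")                            # point lookup touches one item
totals = count_by_country(store.values())  # batch scan touches every item
```

<p>The point operations touch one item each, while the aggregate has to read the whole data set, which is the kind of work Hadoop is optimized for.</p>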
<p><strong><span style="text-decoration: underline;">Group 3 (real-time Hadoop-based analytical platforms)<br />
</span></strong>Group 3 takes Hadoop to the next level, transforming it from a mere batch processing system to a full-fledged analytical platform that can answer queries in real-time. Furthermore, by adding a more robust SQL interface to Hadoop (in addition to industry-standard ODBC connectors), group 3 products help to hide the complexity of Hadoop and reduce the need for Hadoop specialists, since traditional business intelligence and visualization tools are now able to interface directly with data stored inside Hadoop. From the InformationWeek list, Hadapt clearly fits in this category, and with certain caveats, so does Cloudera Impala. The caveats are that, as of the time of writing this blog post, (a) Impala is an extremely young codebase and is still only in beta, and (b) Impala only supports a small subset of SQL and does not support UDFs or other ways to combine structured and unstructured data in the same query, so calling it an “analytical platform” might be a bit of a stretch.</p>
<p><strong><span style="text-decoration: underline;">Group 4 (Hadoop-based BI solutions)<br />
</span></strong>Group 4 products are often lumped together with group 3 products and mistaken for their competitors. However, just as business intelligence tools and analytical database solutions are highly complementary and were often packaged together in the pre-Hadoop world, the same is true in the Hadoop/Big Data world. Therefore, Datameer, Karmasphere, and Platfora, all of which function as a business intelligence layer above Hadoop, are capable of working closely with the group 3 products (with announcements along these lines already starting to appear).</p>
<p>In conclusion, although “Big Data” is an enormous and rapidly growing market, no single data management software product is going to rule the market. Rather, there are four major groups of data management solutions within the Big Data space; and while there is fierce competition within each group, at the macro level these groups not only can co-exist, but are highly complementary. In the long run, it is likely that 2-3 leaders will emerge in each group and share the Big Data pie.</p>
]]></content:encoded>
			<wfw:commentRss>http://hadapt.com/classifying-todays-big-data-innovators/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Hadapt Adds $6.75MM in Funding</title>
		<link>http://hadapt.com/hadapt-adds-6-5mm-in-funding/</link>
		<comments>http://hadapt.com/hadapt-adds-6-5mm-in-funding/#comments</comments>
		<pubDate>Mon, 12 Nov 2012 15:08:02 +0000</pubDate>
		<dc:creator>Justin Borgman</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hadapt.com/?p=1366</guid>
		<description><![CDATA[<p>As reported in the news on the 8th, Hadapt has added $6.75MM to the war chest.  We’ll use the additional capital primarily to accelerate our go to market strategy and investment in the intellectual property that makes Hadapt so unique. &#8230; <a href="http://hadapt.com/hadapt-adds-6-5mm-in-funding/">Continued</a></p><p>The post <a href="http://hadapt.com/hadapt-adds-6-5mm-in-funding/">Hadapt Adds $6.75MM in Funding</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>As reported in the <a href="http://www.xconomy.com/boston/2012/11/08/big-data-startup-hadapt-now-backed-by-6-75m-from-atlas/">news</a> on the 8th, Hadapt has added $6.75MM to the war chest.  We’ll use the additional capital primarily to accelerate our go to market strategy and investment in the intellectual property that makes Hadapt so unique.</p>
<p>Following a strong customer response to our GA launch earlier this year and a $9.5MM Series A financing that took place in late 2011, we actually had no need for additional capital.  And that’s exactly what I said when Atlas Venture initially approached us with an unsolicited term sheet.</p>
<p>However, as any good entrepreneur knows, the best time to raise money is when you don’t need it.  And that was certainly a factor here.  With a strong increase in valuation, the additional money allows founders, employees, and existing investors to keep smiling about the value of their shares.</p>
<p>Furthermore, it enabled us to add Atlas Venture to our investor group.  As many of you already know, Chris Lynch has been the Chairman of our Board since April and has more recently become an investor at Atlas.  He is understandably excited about Hadapt&#8217;s increasing impact within the Big Data market and the growing momentum around our <a href="http://hadapt.com/news/hadapt-accelerates-hadoops-move-to-production-with-interactive-applications/">v2.0 announcement</a>.  We’re equally excited to have him even more personally invested in our success.</p>
]]></content:encoded>
			<wfw:commentRss>http://hadapt.com/hadapt-adds-6-5mm-in-funding/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Importance of Interactive SQL on Hadoop</title>
		<link>http://hadapt.com/the-importance-of-interactive-sql-on-hadoop/</link>
		<comments>http://hadapt.com/the-importance-of-interactive-sql-on-hadoop/#comments</comments>
		<pubDate>Wed, 17 Oct 2012 23:03:28 +0000</pubDate>
		<dc:creator>Daniel Abadi</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hadapt.com/?p=1109</guid>
		<description><![CDATA[<p>As most readers of this blog are already aware, today we announced Hadapt 2.0 that includes several major upgrades to our software, including “interactive” SQL on Hadoop and a new Hadapt Development Kit (HDK) that greatly expands the variety and &#8230; <a href="http://hadapt.com/the-importance-of-interactive-sql-on-hadoop/">Continued</a></p><p>The post <a href="http://hadapt.com/the-importance-of-interactive-sql-on-hadoop/">The Importance of Interactive SQL on Hadoop</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>As most readers of this blog are already aware, today we announced Hadapt 2.0 that includes several major upgrades to our software, including “interactive” SQL on Hadoop and a new Hadapt Development Kit (HDK) that greatly expands the variety and reusability of analytical applications that can be built on top of Hadapt. Curt Monash (<a href="http://bit.ly/Hadapt20DBMS2">http://bit.ly/Hadapt20DBMS2</a>) and Derrick Harris (<a href="http://bit.ly/Hadapt20GigaOM">http://bit.ly/Hadapt20GigaOM</a>) have already posted some nice commentary and further details about Hadapt 2.0. In this post I want to expand on the last paragraph of Derrick’s post, which includes a quote from me highlighting the interactive query part of Hadapt 2.0, the feasibility of building this inside Hadoop, and Hadapt’s historical mission to achieve this.</p>
<p>Hadoop started out as an open source effort to replicate the system described in the MapReduce research paper that was published by Google in 2004. It started gaining steam in 2006 and finally got adopted by several major Web enterprises for use in production in 2008. By 2009 it became clear that Hadoop was going to be a major force to be reckoned with for processing unstructured data. Between then and now, just about everybody in the industry has agreed that Hadoop and database systems were perfectly complementary; Hadoop can be used for processing unstructured data, ETL-style transformations, and one-off data processing jobs, while database systems can be used for fast SQL access to structured data. Data can be shipped between Hadoop and relational database systems over a connector. For example, a Hadoop job can be run to structure the data, after which it is sent to a relational database system (which may be bundled together with Hadoop in the same cluster “appliance”) where it can be queried using SQL.</p>
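<p>The connector-style workflow described above (structure the raw data first, then ship it to a relational system and query it with SQL) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: a plain Python generator stands in for the Hadoop job, an in-memory SQLite database stands in for the relational database system, and the log format, the <code>hits</code> table, and the function names are all hypothetical.</p>

```python
import re
import sqlite3

# Hypothetical raw, unstructured input (stand-in for files in HDFS).
RAW_LOG = [
    '10.0.0.1 - [17/Oct/2012:10:00:01] "GET /index.html" 200',
    '10.0.0.2 - [17/Oct/2012:10:00:02] "GET /missing" 404',
    'malformed line that the parser skips',
    '10.0.0.1 - [17/Oct/2012:10:00:05] "GET /about.html" 200',
]

LINE_RE = re.compile(r'^(\S+) - \[(.+?)\] "(\S+) (\S+)" (\d{3})$')

def structure(lines):
    """Stand-in for the Hadoop job: turn raw text lines into typed rows."""
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            ip, ts, method, path, status = m.groups()
            yield ip, ts, method, path, int(status)

def load_and_query(rows):
    """Stand-in for the connector plus database: load the rows, then use SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hits (ip TEXT, ts TEXT, method TEXT, path TEXT, status INTEGER)"
    )
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        "SELECT status, COUNT(*) FROM hits GROUP BY status ORDER BY status"
    ).fetchall()
    conn.close()
    return result

status_counts = load_and_query(structure(RAW_LOG))
```

<p>Note that the data crosses a system boundary between the two steps; in a real deployment that boundary is the connector between Hadoop and the database.</p>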
<p>Since the vast majority of the world has spent the past 4 years agreeing that MapReduce and database systems are complementary, very few people perceived the need for high performance SQL on Hadoop. If database systems and Hadoop are going to be deployed side by side (for example, in an appliance that includes “Hadoop nodes” and “database nodes”), it is totally redundant to give Hadoop a high quality SQL interface, since the complementary database system can be used for SQL access. Therefore projects like Hive have languished in mediocrity, with far fewer active developers than other more strategic elements of the Hadoop ecosystem (such as HDFS).</p>
<p>Contrary to conventional wisdom, we believed differently, and for the last 4 years we have been espousing a contradictory vision. Instead of viewing Hadoop and database systems as complementary, we have viewed them as competitive, and have championed the idea of bringing high performance SQL to Hadoop in order to create a single system that can handle both structured and unstructured data processing. In 2008 we started building a system called HadoopDB that does exactly this, and by March 2009 we completed our initial prototype and submitted our work to VLDB. The work was accepted and published at VLDB, and we founded Hadapt shortly afterwards (in 2010) to productize this defiant vision.</p>
<p>Over the past several years, we have been laser-focused on turning Hadoop into an all-purpose analytical platform for both unstructured and structured data while providing high performance SQL access to it. We have worked hard on achieving high performance for joins, improving optimization and scheduling of SQL queries, and delivering good performance on complex, ad-hoc, data warehousing-style queries. With Hadapt 2.0, we have even managed to remove the Hadoop start-up overhead for the shorter, simpler queries, so that these queries can run in less than a second.</p>
<p>Despite our focus from the beginning on bringing high performance SQL to Hadoop, only now are we willing to call ourselves “interactive”. To us, interactivity implies a truly fluid and engaging experience for the user with the system. It must include both of the following characteristics:</p>
<ol>
<li>Simple queries that involve selections, projections, and aggregations should have latencies measured in milliseconds.</li>
<li>More complex ad-hoc queries that may involve multiple joins should finish quickly enough that the user does not have to switch to something else while the query runs in the background.</li>
</ol>
<p>Building a truly interactive system that includes both of the above characteristics is highly nontrivial &#8212; our robust foundation began with adding fundamental relational database technology to Hadoop; had we not started working on this 4 years ago, and focused our entire engineering efforts on bringing relational database technology to Hadoop, we wouldn’t be able to offer anything near the quality of the software which is in Hadapt 2.0. As customers increasingly demand a single unified system for multi-structured analytics as opposed to multiple systems with connectors between them, Hadapt is the leading innovator and extremely well positioned to meet these customer demands.</p>
]]></content:encoded>
			<wfw:commentRss>http://hadapt.com/the-importance-of-interactive-sql-on-hadoop/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Hadapt Accelerates Hadoop’s Move to Production with Interactive Applications</title>
		<link>http://hadapt.com/hadapt-accelerates-hadoops-move-to-production-with-interactive-applications/</link>
		<comments>http://hadapt.com/hadapt-accelerates-hadoops-move-to-production-with-interactive-applications/#comments</comments>
		<pubDate>Tue, 16 Oct 2012 23:02:15 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hadapt.com/?p=1107</guid>
		<description><![CDATA[<p>Cambridge, MA, October 16, 2012 – Hadapt, the only data analytics platform natively integrating SQL with Apache Hadoop, today announced version 2.0 of its Adaptive Analytical Platform. The new release features the industry’s first interactive applications on Hadoop, via Hadapt Interactive Query; &#8230; <a href="http://hadapt.com/hadapt-accelerates-hadoops-move-to-production-with-interactive-applications/">Continued</a></p><p>The post <a href="http://hadapt.com/hadapt-accelerates-hadoops-move-to-production-with-interactive-applications/">Hadapt Accelerates Hadoop’s Move to Production with Interactive Applications</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><strong>Cambridge, MA, October 16, 2012 – </strong><a href="http://www.hadapt.com/">Hadapt</a>, the only data analytics platform natively integrating SQL with Apache Hadoop, today announced version 2.0 of its <a href="http://www.hadapt.com/product">Adaptive Analytical Platform</a>. The new release features the industry’s first interactive applications on Hadoop, via Hadapt Interactive Query; the Hadapt Development Kit<sup>TM</sup> (HDK) for custom analytics; and integration with <a href="http://www.tableausoftware.com/">Tableau Software</a>.</p>

<p>“There is significant demand in the market to leverage Hadoop for Big Data analytics,” said Merv Adrian, Research VP at Gartner. “Two key enablers must be addressed for the business analyst community: 1) interactive query capabilities coupled with continuous data ingestion, and 2) advanced analytics packaged as SQL functions that can be natively integrated with existing BI tools.”</p>

<p>Hadapt resolves both issues, accelerating the production deployment of Hadoop in the enterprise. Hadapt 2.0 empowers analysts to conduct investigative analytics on all of their data (structured, unstructured or semi-structured) in a single, unified platform, using standard business intelligence tools like Tableau. The new release also includes the HDK, allowing analysts to create advanced SQL analytic functions that can be used for campaign analysis, full text search, funnel analysis, sentiment analysis, pattern matching and predictive modeling.</p>

<p>“The integration of Tableau with Hadapt’s Interactive Query capabilities delivers access to advanced analytics on Hadoop via SQL at petabyte scale,” said Daniel Jewett, VP of Product Management at Tableau. “The combination of our products allows business analysts to bridge the gap to the full Hadoop analytic ecosystem.”</p>

<p>“The idea of interactive applications on Hadoop was one of the founding ideas behind the company, and today we’re making it a reality,” said Hadapt CEO Justin Borgman. “For a long time, there’s been this incredibly powerful tool, Hadoop, that was limited to the technically-savvy community. Now, it’s here for the masses: interactive, massive-scale data processing—on any type of data—using commodity hardware and a familiar SQL interface.”</p>

<p>Hadapt has been selected as a Startup Showcase finalist at the <a href="http://www.strataconf.com/stratany2012/?intcmp=il-strata-stny12-franchise-page">O’Reilly Strata Conference and Hadoop World</a> in New York City October 23<sup>rd</sup>-25<sup>th</sup>, where it will demonstrate the new release. Availability of Hadapt 2.0 is planned for early Q1 2013. For more information, visit <a href="http://www.hadapt.com/">www.hadapt.com</a>.</p>

<p><strong>About Hadapt<br />
</strong>Hadapt has developed the industry’s only Big Data analytic platform natively integrating SQL with Apache Hadoop. The unification of these traditionally segregated platforms enables customers to analyze all of their data (structured, semi-structured and unstructured) in a single platform—no connectors, complexities or rigid structure. The company is headquartered in Cambridge, MA.</p>

<p><strong>Media Contact<br />
</strong>Larry Bouchie<br />
<a href="mailto:larry.bouchie@hadapt.com">larry.bouchie@hadapt.com<br />
</a>(781) 620-0278</p>
]]></content:encoded>
			<wfw:commentRss>http://hadapt.com/hadapt-accelerates-hadoops-move-to-production-with-interactive-applications/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Invisible loading: A new paradigm for loading from unstructured to structured storage</title>
		<link>http://hadapt.com/invisible-loading-a-new-paradigm-for-loading-from-unstructured-to-structured-storage/</link>
		<comments>http://hadapt.com/invisible-loading-a-new-paradigm-for-loading-from-unstructured-to-structured-storage/#comments</comments>
		<pubDate>Wed, 05 Sep 2012 23:00:40 +0000</pubDate>
		<dc:creator>Daniel Abadi</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hadapt.com/?p=1104</guid>
		<description><![CDATA[<p>In my last post on this blog, I outlined the flaws of the ubiquitous Hadoop-DBMS “connector” technology that unnecessarily links together two different systems that have an extremely similar architecture. In this post, I will discuss a new loading technology we &#8230; <a href="http://hadapt.com/invisible-loading-a-new-paradigm-for-loading-from-unstructured-to-structured-storage/">Continued</a></p><p>The post <a href="http://hadapt.com/invisible-loading-a-new-paradigm-for-loading-from-unstructured-to-structured-storage/">Invisible loading: A new paradigm for loading from unstructured to structured storage</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>In my <a href="http://hadapt.com/why-database-to-hadoop-connectors-are-flawed">last post on this blog</a>, I outlined the flaws of the ubiquitous Hadoop-DBMS “connector” technology that unnecessarily links together two different systems that have an extremely similar architecture.</p>
<p>In this post, I will discuss a new loading technology we call <strong>invisible loading</strong> that addresses many other pain points that people encounter when using a big data solution that involves structured and unstructured components. This (Hadapt-owned) patent-pending technology was presented by the lead author, Azza Abouzied, last week, and we are finally releasing the <a href="http://cs-www.cs.yale.edu/homes/dna/papers/invisibleloading.pdf">academic paper</a> behind this technology today as part of this blog post.</p>
<p>The basic idea is the following: an unstructured data store such as HDFS is great for raw data sets since it does not require the raw data to conform to any predefined schema, and can handle large amounts of data at extremely low cost. This raw data can be scrubbed, cleaned, and refined via MapReduce jobs, Pig scripts, and other useful tools in the Hadoop ecosystem. Over time, this raw data becomes increasingly structured, at which point the “transformation” part of its lifetime comes to an end and the “query” part of its lifetime commences, during which the data is repeatedly accessed and queried via tools and languages like Hadapt, Hive and SQL. This common data processing workflow is illustrated in the diagram below.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-1447" title="1" src="http://hadapt.com/assets/12.png" alt="" width="600" height="254" /></p>

<p>In existing big data solutions, this transition between the transformation and query phases of the data lifetime needs to be detected by a human, who will then initiate a load from one shared-nothing scalable parallel analysis platform (Hadoop) to another shared-nothing scalable parallel analysis platform (an MPP database system). In addition to the design flaws of moving data between architecturally similar data processing platforms (as discussed in my previous post), this workflow causes the following pain points:</p>
<ul>
<li>Design expertise and efforts required in separating transformation and query phases</li>
<li>ETL expertise and efforts needed in moving data from unstructured to structured store</li>
<li>Performance overhead involved in executing the aforementioned ETL jobs</li>
</ul>
<p>However, it is possible to <strong>automatically detect when data becomes structured</strong> enough to fit in a structured store. For example, there exist tools like <a href="http://www.padsproj.org/papers/popl08.pdf">PADS</a> and <a href="http://www.cloudera.com/blog/2011/07/recordbreaker-automatic-structure-for-your-text-formatted-data/">RecordBreaker</a> that can discover structure in datasets. Alternatively, it is possible to use static code analysis to detect when MapReduce jobs issued over data in HDFS assume a certain structure (e.g., if there is parsing code at the beginning of the Map phase of a MapReduce job). Furthermore, if the user is using Pig parsing libraries or has already created a schema for Hadapt, Hive or SQL, then it is trivial to discover the structure in the data.</p>
<p>Once the system has discovered that the data has a certain structure to it, there can be huge <strong>performance benefits</strong> to storing that data in a structured format that can leverage knowledge of the repeated structure in the data. This structured data store need not be an MPP database that sits across the network on a different set of servers; rather, data can simply be structured into relational storage on the same physical servers as the raw data.</p>
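<p>To give a feel for what discovering structure can mean in the simplest case, here is a rough Python sketch that guesses a delimiter and per-column types from a sample of text lines. It is a deliberately naive heuristic written for this illustration; it is not the algorithm used by PADS or RecordBreaker, and the function names and the <code>min_fraction</code> threshold are assumptions.</p>

```python
from collections import Counter

CANDIDATE_DELIMS = [",", "\t", "|"]

def guess_type(values):
    """Return the narrowest of int, float, str that fits every value."""
    for caster, name in ((int, "int"), (float, "float")):
        try:
            for v in values:
                caster(v)
            return name
        except ValueError:
            continue
    return "str"

def infer_schema(lines, min_fraction=0.9):
    """Return (delimiter, column types) if the lines look tabular, else None."""
    lines = [ln for ln in lines if ln.strip()]
    if not lines:
        return None
    best = None
    for delim in CANDIDATE_DELIMS:
        # How many fields does each line split into with this delimiter?
        widths = Counter(len(ln.split(delim)) for ln in lines)
        width, count = widths.most_common(1)[0]
        # Accept only if most lines agree on a field count greater than one.
        if width > 1 and count / len(lines) >= min_fraction:
            if best is None or count > best[2]:
                best = (delim, width, count)
    if best is None:
        return None
    delim, width, _ = best
    rows = [ln.split(delim) for ln in lines if len(ln.split(delim)) == width]
    columns = list(zip(*rows))
    return delim, [guess_type(col) for col in columns]

sample = ["1,alice,3.5", "2,bob,4.0", "3,carol,2"]
schema = infer_schema(sample)
```

<p>Free-form text yields <code>None</code>, which in this toy model corresponds to data that is not yet structured enough to move into a structured store.</p>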
<p>The key contribution of our research is that this shifting of data from unstructured data stores such as HDFS to structured storage sitting on the same physical machine can happen <strong>invisibly and incrementally</strong>. The data scientist (or data analyst, or BI client, etc.) can access the data via MapReduce jobs, Pig scripts, or any other standard interface to Hadoop. These jobs read data from HDFS just like normal, and return results just like normal. However, since the data needs to be read anyway in order to process the job/query, a subset of the data that is read is moved into structured storage. Future jobs/queries over the same input data set <strong>automatically merge the data</strong> that is still in HDFS with the data that is in structured storage (with reads from structured storage being much faster than reads from HDFS). The user/client is completely unaware of the invisible data movement &#8212; all that is observable is a <strong>steady improvement in query performance</strong> as more and more data is read from structured storage.</p>
<p>What’s cool about this process is that the incremental nature of the data shifting allows for the human to be eliminated from the loading process. If the data is still being continuously transformed and refined, then very little progress will be made in moving data to structured storage. However, once the data becomes stable, and is continuously queried, incremental progress will be made for loading the data into faster, structured storage, until the entire data set ends up there. Meanwhile, the cost of the load is nearly invisible, since the reading of the data was required anyway to process the early queries.</p>
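<p>The incremental mechanism can be illustrated with a toy model in Python. This is only a sketch under loud assumptions: a list of text lines stands in for a file in HDFS, a second list stands in for structured storage, and the class name <code>InvisibleLoader</code>, the <code>chunk_size</code> parameter, and the <code>parse_calls</code> counter are hypothetical. It is not the algorithm from the paper, which deals with many details this model ignores.</p>

```python
class InvisibleLoader:
    """Toy model of incremental loading as a side effect of queries."""

    def __init__(self, raw_lines, parse, chunk_size=2):
        self.raw = list(raw_lines)    # unparsed data (stand-in for HDFS)
        self.structured = []          # parsed rows (stand-in for structured storage)
        self.parse = parse            # user-supplied parsing function
        self.chunk_size = chunk_size  # rows migrated per query
        self.parse_calls = 0          # parsing work paid for by queries so far

    def _parse(self, line):
        self.parse_calls += 1
        return self.parse(line)

    def scan(self):
        """Return all rows, merging structured storage with the remaining raw data."""
        # Rows that were already migrated are read without any parsing.
        rows = list(self.structured)
        # The remaining raw data has to be parsed anyway to answer this query.
        parsed_now = [self._parse(line) for line in self.raw]
        rows.extend(parsed_now)
        # Side effect: keep a slice of that parsing work instead of discarding it.
        moved = parsed_now[: self.chunk_size]
        self.structured.extend(moved)
        del self.raw[: len(moved)]
        return rows

    def fraction_loaded(self):
        total = len(self.structured) + len(self.raw)
        return len(self.structured) / total if total else 1.0

loader = InvisibleLoader(
    ["1,10", "2,20", "3,30", "4,40", "5,50"],
    parse=lambda ln: tuple(int(x) for x in ln.split(",")),
)
# Four identical queries: each returns the same answer, but parses less raw data.
totals = [sum(value for _, value in loader.scan()) for _ in range(4)]
```

<p>Every query sees the full data set, yet each one parses less raw input than the one before, and no explicit load step is ever issued.</p>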

<h2>The Hadapt Advantage</h2>
<p>Technologies such as invisible loading will enable Hadapt to function as a data refinery as well as a full-service Big Data analytical platform. The raw data starts in HDFS (Hadapt is integrated with Hadoop and leverages various Hadoop components), and as it becomes refined and structured, it automatically gets moved into optimized structured storage for fast querying and structured analysis (via languages such as SQL). This process happens automatically, without human intervention, and also as a side effect of standard interactions with the Hadapt/Hadoop platform, so that the client is unable to detect this data movement. Furthermore, it occurs within the same physical hardware, so that the network is not burdened by this loading process (otherwise the loading would certainly not be invisible).</p>
<p>While the invisible loading technology can potentially be adapted to other big data systems, the <a href="/why-database-to-hadoop-connectors-are-flawed/">unified (connectorless) analytics</a> nature of Hadapt makes it the only platform in which the full power of invisible loading can be unleashed.</p>
<p>There are obviously a lot of important details to make all of this work. We encourage readers of this blog to read both the <a href="http://cs-www.cs.yale.edu/homes/dna/papers/invisibleloading.pdf">original research paper</a> and the slides that Azza used a couple of days ago to <a href="http://www.slideshare.net/abadid/invisible-loading">present this work</a>. We also expect to follow up this post with additional details in the future.</p>
<p>The post <a href="http://hadapt.com/invisible-loading-a-new-paradigm-for-loading-from-unstructured-to-structured-storage/">Invisible loading: A new paradigm for loading from unstructured to structured storage</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p><img src="http://track.hubspot.com/__ptq.gif?a=219713&k=14&bu=http%3A%2F%2Fhadapt.com%2Fblog%2F&r=http%3A%2F%2Fhadapt.com%2Finvisible-loading-a-new-paradigm-for-loading-from-unstructured-to-structured-storage%2F&bvt=rss&p=wordpress" style="float:left;" xml:base="http://hadapt.com/feed/" width="1" height="1" border="0" align="right"/>]]></content:encoded>
			<wfw:commentRss>http://hadapt.com/invisible-loading-a-new-paradigm-for-loading-from-unstructured-to-structured-storage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Scientists – The Opportunity, the Skillset and the Mission</title>
		<link>http://hadapt.com/data-scientists-the-opportunity-the-skillset-and-the-mission/</link>
		<comments>http://hadapt.com/data-scientists-the-opportunity-the-skillset-and-the-mission/#comments</comments>
		<pubDate>Tue, 31 Jul 2012 22:55:36 +0000</pubDate>
		<dc:creator>Mingsheng Hong</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hadapt.com/?p=1099</guid>
		<description><![CDATA[<p>In light of the Boston Big Data initiatives, various data analytics meetups and seminars are springing up.  I had the honor to be the inaugural speaker at an exciting data science seminar series, a group with nearly 400 members. The talk received &#8230; <a href="http://hadapt.com/data-scientists-the-opportunity-the-skillset-and-the-mission/">Continued</a></p><p>The post <a href="http://hadapt.com/data-scientists-the-opportunity-the-skillset-and-the-mission/">Data Scientists – The Opportunity, the Skillset and the Mission</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p><img src="http://track.hubspot.com/__ptq.gif?a=219713&k=14&bu=http%3A%2F%2Fhadapt.com%2Fblog%2F&r=http%3A%2F%2Fhadapt.com%2Fdata-scientists-the-opportunity-the-skillset-and-the-mission%2F&bvt=rss&p=wordpress" style="float:left;" xml:base="http://hadapt.com/feed/" width="1" height="1" border="0" align="right"/>]]></description>
			<content:encoded><![CDATA[<p>In light of the <a href="http://www.mass.gov/governor/pressoffice/pressreleases/2012/2012530-governor-announces-big-data-initiative.html">Boston Big Data initiatives</a>, various data analytics meetups and seminars are springing up. I had the honor of being the inaugural speaker at an exciting <a href="http://www.meetup.com/The-Data-Scientist/">data science seminar series</a>, a group with nearly 400 members. The talk received excellent <a href="http://storify.com/MarketMeSuite/data-scientist-seminar-series-kicks-off-dsss1">live Twitter coverage</a> from <a href="http://marketmesuite.com/">MarketMeSuite</a>, and was written up in Gil Press’s <a href="http://smartdatacollective.com/node/56046">blog post</a>.</p>
<p>In the spirit of data analytics and visualization, I presented a summary of my background by visualizing my <a href="http://www.linkedin.com/in/mingshenghong">LinkedIn profile</a> in word cloud format.</p>

<p style="text-align: center;"><img class="aligncenter size-full wp-image-1434" title="1" src="http://hadapt.com/assets/11.png" alt="" width="731" height="417" /></p>
<p>Here are a few fun quotes on data, by Tim O’Reilly and Mark Pincus respectively. These quotes paint a vivid picture of the crucial role that data is playing in the high tech industry. That’s what data science is about too – <strong>not just cold number crunching, but </strong><a href="http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen.html"><strong>good storytelling</strong></a>.</p>
<table>
<tbody>
<tr>
<td>
<div id="attachment_1435" class="wp-caption alignleft" style="width: 110px"><img class=" wp-image-1435" title="2" src="http://hadapt.com/assets/2.jpeg" alt="" width="100" height="150" /><p class="wp-caption-text">Tim O&#8217;Reilly</p></div></td>
<td>
<blockquote><p>&#8220;Data is the next <strong>Intel Inside</strong>.&#8221;</p></blockquote>
</td>
</tr>
</tbody>
</table>

<table>
<tbody>
<tr>
<td>
<blockquote><p>&#8220;Data is the <strong>Operating System</strong> of Zynga.&#8221;</p></blockquote>
</td>
<td>
<div id="attachment_1436" class="wp-caption alignright" style="width: 110px"><img class="size-full wp-image-1436" title="3" src="http://hadapt.com/assets/33.png" alt="" width="100" height="117" /><p class="wp-caption-text">Mark Pincus</p></div></td>
</tr>
</tbody>
</table>
<p>The above quotes about the data industry lead us to a <a href="http://www.nytimes.com/2009/08/06/technology/06stats.html?_r=1">more concrete statement</a> from Prof. Hal Varian. He said statisticians are going <em>waaay up</em> in the social hierarchy, or at least that’s my interpretation. Anticipating how unrealistic and ludicrous this claim might sound, he added, “I’m not kidding.”</p>
<table>
<tbody>
<tr>
<td>
<div id="attachment_1437" class="wp-caption alignleft" style="width: 110px"><img class="size-full wp-image-1437" title="4" src="http://hadapt.com/assets/41.png" alt="" width="100" height="150" /><p class="wp-caption-text">Professor Hal Varian, Chief Economist at Google</p></div></td>
<td>
<blockquote><p>&#8220;I keep saying that the sexy job in the next 10 years will be statisticians. And I’m not kidding.&#8221;</p></blockquote>
</td>
</tr>
</tbody>
</table>
<p>I’m not an economist, but Professor, I beg to differ here, and I’ve got data to back me up. <img src='http://hadapt.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  Take a look: this is how many job postings for statisticians there have been over the last couple of years. Since Hal made the statement in Jan ’09, I don’t see any statistically significant change in the trend.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-1438" title="5" src="http://hadapt.com/assets/5.png" alt="" width="474" height="264" /><img class="aligncenter size-full wp-image-1439" title="6" src="http://hadapt.com/assets/6.png" alt="" width="473" height="264" /></p>
<p>On the other hand though, the demand for data scientists is just exploding! Here we derive an <strong>actionable insight</strong>: Update your LinkedIn profile by adding “data science” to your skill set. If possible, work with your employer to change your job title to data scientist. “Dress for the job you want.” – Perception is reality!! <img src='http://hadapt.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Of course, Prof. Varian is actually right. Here is something else he said. He is apparently reusing the term statistician to describe a new species of talent – the data scientist.</p>
<blockquote><p>The new breed of statisticians … use <strong>powerful computers</strong> and <strong>sophisticated mathematical models</strong> to hunt for <strong>meaningful patterns and insights</strong> in vast troves of data.</p></blockquote>
<p>So what skill sets does a data scientist need? From my research and experience, here are the <a href="http://www.dataists.com/2010/09/the-data-science-venn-diagram/">hard skills</a> and <a href="http://www-01.ibm.com/software/data/infosphere/data-scientist/">soft skills</a>. In particular, communication is a crucial skill for a data scientist, because <a href="http://sethgodin.typepad.com/seths_blog/2012/06/a-lesson-from-a-great-architect.html">just like an architect</a>, a data scientist must sell ideas and analytic insights to the customer or management in order to make a real impact.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-1440" title="7" src="http://hadapt.com/assets/7.png" alt="" width="575" height="243" /></p>
<p>There is currently <a href="http://www.mckinsey.com/~/media/McKinsey/dotcom/Insights%20and%20pubs/MGI/Research/Technology%20and%20Innovation/Big%20Data/MGI_big_data_exec_summary.ashx">huge demand</a> for, and a <a href="http://mashable.com/2012/01/13/career-of-the-future-data-scientist-infographic/">significant shortage</a> of, big data and data science talent. You don’t need to be Google’s chief economist to understand the supply and demand principle at work. Go ahead and develop the above skills as best you can! Here are some resources I would recommend:</p>
<ul>
<li><a href="http://www.hackreduce.org/">hack/reduce</a>: Boston’s Big Data Hackspace. “Code Big or Go Home!”</li>
<li>Berkeley <a href="http://datascienc.es/">Data Science course</a></li>
<li>Stanford <a href="https://www.coursera.org/course/ml">Machine Learning course</a></li>
</ul>
<p>Data scientists help their companies incubate new analytic use cases to monetize data. However, I personally believe the shortage of data scientists will last for a while, and only a small percentage of companies will be able to build up a team of data scientists. What can we do for the rest of the companies?</p>
<p>I believe better tools need to be built to make Big Data accessible to companies short on data scientists, and to less tech-savvy people. As a data scientist working on building one such tool at <a href="http://www.hadapt.com/">Hadapt</a>, I am focused on defining the product roadmap with a data-driven approach, so that what we build is not just helping one customer at a time, but empowering a wave of companies and users to monetize data. As such, the Data Scientist is the new Product Manager.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-1441" title="8" src="http://hadapt.com/assets/8.png" alt="" width="300" height="239" /></p>
<p>The slide deck of my talk can be <a href="/assets/data-scientist-seminar-series.pdf">downloaded here</a>. I look forward to your thoughts!</p>
<p>The post <a href="http://hadapt.com/data-scientists-the-opportunity-the-skillset-and-the-mission/">Data Scientists – The Opportunity, the Skillset and the Mission</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p><img src="http://track.hubspot.com/__ptq.gif?a=219713&k=14&bu=http%3A%2F%2Fhadapt.com%2Fblog%2F&r=http%3A%2F%2Fhadapt.com%2Fdata-scientists-the-opportunity-the-skillset-and-the-mission%2F&bvt=rss&p=wordpress" style="float:left;" xml:base="http://hadapt.com/feed/" width="1" height="1" border="0" align="right"/>]]></content:encoded>
			<wfw:commentRss>http://hadapt.com/data-scientists-the-opportunity-the-skillset-and-the-mission/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why Database-to-Hadoop Connectors are Fundamentally Flawed and Entirely Unnecessary</title>
		<link>http://hadapt.com/why-database-to-hadoop-connectors-are-flawed/</link>
		<comments>http://hadapt.com/why-database-to-hadoop-connectors-are-flawed/#comments</comments>
		<pubDate>Tue, 24 Jul 2012 22:53:49 +0000</pubDate>
		<dc:creator>Daniel Abadi</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://hadapt.com/?p=1093</guid>
		<description><![CDATA[<p>The Past Before Hadoop, the Big Data market was dominated by a large variety of proprietary relational databases. These relational databases were focused on achieving high performance query processing of structured data. Although most relational databases do not have unlimited &#8230; <a href="http://hadapt.com/why-database-to-hadoop-connectors-are-flawed/">Continued</a></p><p>The post <a href="http://hadapt.com/why-database-to-hadoop-connectors-are-flawed/">Why Database-to-Hadoop Connectors are Fundamentally Flawed and Entirely Unnecessary</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p><img src="http://track.hubspot.com/__ptq.gif?a=219713&k=14&bu=http%3A%2F%2Fhadapt.com%2Fblog%2F&r=http%3A%2F%2Fhadapt.com%2Fwhy-database-to-hadoop-connectors-are-flawed%2F&bvt=rss&p=wordpress" style="float:left;" xml:base="http://hadapt.com/feed/" width="1" height="1" border="0" align="right"/>]]></description>
			<content:encoded><![CDATA[<h3>The Past</h3>
<p><img class="alignright  wp-image-1424" src="http://hadapt.com/assets/1.png" alt="" width="286" height="190" />Before Hadoop, the Big Data market was dominated by a large variety of proprietary relational databases. These relational databases were focused on achieving high performance query processing of structured data. Although most relational databases do not have unlimited scalability (even the parallel “MPP” relational databases), they were still scalable enough to control the vast majority of the Big Data market. However, starting around 2008, Hadoop began to disrupt this market in three major ways:</p>

<ol>
<li>Hadoop uses a more flexible programming framework in order to enable high performance query processing, even for unstructured data.</li>
<li>Hadoop is open source, whereas every other parallel database system used proprietary code.</li>
<li>Hadoop is more scalable than even the most scalable relational database system.</li>
</ol>

<p>Fearful of the emergence of Hadoop, the initial reaction of relational database vendors was to (correctly) blast the technical deficiencies of Hadoop in an attempt to stymie Hadoop&#8217;s increasing popularity. Most of the criticisms revolved around the problem that Hadoop is not optimized for data processing of traditional, structured, relational data; and that it is “batch-oriented” (as opposed to real-time).</p>
<h3>The Present</h3>
<p>As Hadoop continued its ascent within the enterprise, the relational vendors had no choice but to resort to a coexistence strategy &#8212; pigeonholing Hadoop into the role of ETL (extract, transform, load) and processing of unstructured data, leaving relational databases to process all structured data.</p>
<p>At the same time, new start-ups were emerging to commercialize Hadoop &#8212; mostly by adding services, support, and management tools around the Hadoop open source core. From a pragmatic standpoint (in order to accelerate Hadoop adoption in the enterprise and receive the services and support dollars that come with such enterprise adoption), these start-ups were willing to allow Hadoop to be pigeonholed by the relational database vendors, thereby enabling these small Hadoop vendors to partner with the much bigger relational database providers. Despite the downside of only yielding Hadoop adoption in a small subset of possible enterprise use-cases, these partnerships led to immediate revenue for these small Hadoop start-ups and increased valuation in follow-up venture capital rounds.</p>
<p>Consequently, due to entirely pragmatic thinking and short-term motivations, every Big Data vendor that matters (both in the relational database space and in the Hadoop space) has advocated for a two system approach to processing Big Data &#8212; Hadoop for the unstructured data and relational databases for the structured data, with a connector between them, shipping data back and forth over a network connection.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-1425" title="" src="http://hadapt.com/assets/2.png" alt="" width="563" height="219" /></p>
<p style="text-align: center;"><strong> TRADITIONAL STRATEGY</strong></p>
<p style="text-align: left;">Every single one of these vendors is incorrect.  This is a poor architectural vision for the future of Big Data processing.</p>
<h3 style="text-align: left;">The Future</h3>
<p>Many people don’t realize that Hadoop and parallel relational databases have an extremely similar design. <strong>Both are capable</strong> of storing large data sets by breaking the data into pieces and storing them on multiple independent (“shared-nothing”) machines in a cluster. <strong>Both scale</strong> processing over these large data sets by parallelizing the processing of the data over these independent machines. <strong>Both do</strong> as much independent processing as possible across individual partitions of data, in order to reduce the amount of data that must be exchanged between machines. <strong>Both store</strong> data redundantly in order to increase fault tolerance. The algorithms for scaling operations like selecting data, projecting data, grouping data, aggregating data, sorting data, and even joining data are the same. If you squint, <strong>the basic data processing technologies of Hadoop and parallel database systems are identical.</strong></p>
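<p>As a rough illustration of that shared pattern, the Python sketch below computes a grouped sum the way either kind of system might: each partition is aggregated independently, and only the small per-partition summaries are combined. The function names and sample data are hypothetical, and the "machines" are simulated with plain lists.</p>

```python
from collections import Counter


def local_aggregate(partition):
    """Runs independently on one node: sum values per key within its own partition."""
    sums = Counter()
    for key, value in partition:
        sums[key] += value
    return sums


def merge(partials):
    """Only the compact per-node summaries are exchanged and combined."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values key by key
    return total


# Three simulated shared-nothing partitions of (key, value) rows.
partitions = [
    [("a", 1), ("b", 2)],
    [("a", 3)],
    [("b", 4), ("b", 5), ("c", 6)],
]

result = merge(local_aggregate(p) for p in partitions)
print(dict(result))  # {'a': 4, 'b': 11, 'c': 6}
```

<p>Whether the local step is called a combiner in a MapReduce job or a partial aggregate in a parallel database query plan, the structure of the computation is the same.</p>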
<p>There is absolutely no <strong>technical reason</strong> why there need to be two separate systems doing the exact same type of parallel processing. While it is true that, today, Hadoop is lacking some of the important features that are available in relational database systems, this gap is closing over time. And while it is true that primary storage in Hadoop (HDFS) is a file system that is optimized for unstructured data, and the primary storage of parallel database systems is a set of relational tables that are optimized for structured data, <strong>there is no reason</strong> why the file storage and relational storage can’t sit side by side on the same physical machines and even on the same disk drives.</p>
<p><strong>There is no reason</strong> why you need two different systems, sitting in two different clusters, that are architected in a fundamentally similar way. <strong>There’s no reason</strong> to pay the increased management costs of having two different systems built by two different vendors. <strong>There’s no reason </strong>to pay the organizational costs that are incurred by the data silos that are created through having multiple systems. And <strong>there’s certainly no reason</strong> to pay the networking costs of getting a decent sized communication pipe between them.</p>
<p>Connectors between databases and Hadoop are entirely the wrong way to think about scalable data processing of enterprise data. Even the upgraded connectors that have been announced in recent months by database and Hadoop vendors (e.g. making a connector “transactional”, or leveraging projects like HCatalog to make the connector more intelligent at the end points) are counterproductive and only serve to propagate a data processing design that is a fundamentally poor long term strategy for an organization. A super-charged connector is still a connector, and that’s the wrong architectural choice moving forward.</p>
<p style="text-align: center;"><img class="aligncenter  wp-image-1429" title="3" src="http://hadapt.com/assets/32.png" alt="" width="446" height="315" /></p>
<p><strong>This is What Happens to a Bridge with Heavy Traffic</strong></p>
<p>In the future, it is clear that a single Hadoop installation will be enough to process both structured and unstructured data in the same system. The only question is how long it will be before the Hadoop vendors will be willing to abandon their short-term partnership strategy with relational database vendors and attack them head-on. It likely won&#8217;t be very long.</p>
<p style="text-align: center;"><img class="aligncenter  wp-image-1430" src="http://hadapt.com/assets/4.png" alt="" width="513" height="424" /></p>
<p style="text-align: center;"><strong>OPTIMAL STRATEGY</strong></p>
<p>The post <a href="http://hadapt.com/why-database-to-hadoop-connectors-are-flawed/">Why Database-to-Hadoop Connectors are Fundamentally Flawed and Entirely Unnecessary</a> appeared first on <a href="http://hadapt.com">Hadapt</a>.</p><img src="http://track.hubspot.com/__ptq.gif?a=219713&k=14&bu=http%3A%2F%2Fhadapt.com%2Fblog%2F&r=http%3A%2F%2Fhadapt.com%2Fwhy-database-to-hadoop-connectors-are-flawed%2F&bvt=rss&p=wordpress" style="float:left;" xml:base="http://hadapt.com/feed/" width="1" height="1" border="0" align="right"/>]]></content:encoded>
			<wfw:commentRss>http://hadapt.com/why-database-to-hadoop-connectors-are-flawed/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
