Channel: StackExchange Replication Questions

Loading a very large table without a numeric ID from MySQL to S3


I'm trying to pump (with Sqoop) a large table (about 500 GB, around 200M rows) from MySQL to S3. However, this table doesn't have a numeric key column; it has a composite primary key made up of 3 columns. I observed that Sqoop cannot chunk the dataset evenly because the IDs are not evenly distributed between the min and max values. Range queries in Sqoop don't work well because that column is not indexed.

Is there a better way to do this with Sqoop or any other technology?

P.S. I'm trying to output the data as Avro files.
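To make the uneven chunking concrete, here is a rough Python sketch of what I believe is happening: the key range between min and max is cut into equal-width pieces, and each piece becomes one chunk. The key values, the skew, and the helper names (`even_range_splits`, `rows_per_split`) are all made up for illustration; this is not Sqoop code and not my real data.

```python
import random


def even_range_splits(lo, hi, num_splits):
    """Cut [lo, hi] into num_splits equal-width ranges (min/max style splitting)."""
    width = (hi - lo) / num_splits
    bounds = [lo + i * width for i in range(num_splits)] + [hi]
    return list(zip(bounds[:-1], bounds[1:]))


def rows_per_split(keys, splits):
    """Count how many keys land in each range; the last range includes its upper bound."""
    counts = [0] * len(splits)
    for k in keys:
        for i, (start, end) in enumerate(splits):
            is_last = i == len(splits) - 1
            if start <= k < end or (is_last and k == end):
                counts[i] += 1
                break
    return counts


random.seed(0)
# Hypothetical skewed keys: 95% clustered at the low end, 5% spread over a wide range.
keys = [random.randint(1, 1_000) for _ in range(9_500)]
keys += [random.randint(1_001, 1_000_000) for _ in range(500)]

splits = even_range_splits(min(keys), max(keys), 4)
counts = rows_per_split(keys, splits)
print(counts)  # the first range holds almost all of the rows
```

With keys skewed like this, one chunk ends up with nearly all of the rows while the others are almost empty, which matches the behaviour I see.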

