Fastest Way to Import Big CSV to Redis


The Problem – Insert millions of keys in Redis

Imagine you need to load a HUGE (millions of lines) CSV file into a Redis cluster to be accessed by your web server(s). You might quickly write some code in your favorite programming language to read the file and insert each line into Redis, only to notice it is taking longer than you expected. This is mainly because you are looping over the lines sequentially: for each one you open a Redis connection, run a Redis put command, wait for Redis to acknowledge it, then close the connection, only to open it again in the next iteration.

The Solution – Use redis-cli --pipe

If you can use simple bash scripting to turn the CSV into the appropriate Redis commands, you’re done : )

CSV file example:

This could be imported into Redis by converting each line (skipping the first one) to the matching command:

Using the following command:
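As one possible way to do the conversion, here is a minimal Python sketch (not the bash approach this post refers to). It assumes a hypothetical two-column CSV with a header row (`id,name`), an illustrative `user:` key prefix, and `SET` as the target command; adjust all three to your data. Each command is written in the Redis protocol format that `redis-cli --pipe` reads.

```python
import csv
import sys


def resp_command(*args: str) -> bytes:
    """Encode one command in the Redis protocol: an array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        # Bulk string: "$<byte length>\r\n<bytes>\r\n"
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def csv_to_resp(lines, key_prefix="user:"):
    """Yield one encoded SET command per CSV data row.

    Assumes column 0 is the key suffix and column 1 is the value
    (a hypothetical layout chosen for illustration).
    """
    reader = csv.reader(lines)
    next(reader, None)  # skip the header line
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        yield resp_command("SET", key_prefix + row[0], row[1])


if __name__ == "__main__" and len(sys.argv) == 2:
    # Stream the encoded commands to stdout so they can be piped onward.
    with open(sys.argv[1], newline="", encoding="utf-8") as f:
        for chunk in csv_to_resp(f):
            sys.stdout.buffer.write(chunk)
```

Saved under a hypothetical name such as `csv_to_resp.py`, it could be run as `python csv_to_resp.py data.csv | redis-cli --pipe`. Because every argument is length-prefixed, values containing spaces or quotes do not need escaping.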

Quick Benchmark:

Run on AWS EC2: the import ran on a C4.2xlarge instance, with Redis running on an M3.Large.

  • PHP (simple loop-and-put implementation): 7 mins
  • redis-cli (pipeline): 9 secs

Result: more than 40 times faster (420 secs vs. 9 secs, roughly 46x)!

 

Notes

  • If you start getting an error similar to (ERR unknown command ‘ET’):
    • Then you need to add “| unix2dos” before “| redis-cli --pipe”, so that every command line ends with CRLF.
  • You can also pipeline in a similar way from the PHP SDKs, view here.
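If unix2dos is not available, the line-ending fix from the first note can be sketched in Python. This assumes the generated commands are plain text with one command per line; the function name `to_crlf` is hypothetical.

```python
def to_crlf(data: bytes) -> bytes:
    """Terminate every line with CRLF, as unix2dos would."""
    # Normalise existing CRLF pairs first so they are not doubled.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
```

Applied to the command stream before it reaches `redis-cli --pipe`, this leaves input that already uses CRLF unchanged.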

See more about this here

Comment ( 1 )

  1. james newton
    Great point! My question / concern is this: Can you do this safely with data posted to a web server from a user? E.g. allowing uploads of text files or other simple CSV data.
