FlowingData Forums » Data Visualization

MachetEC2: Open Visualization / Big Data Toolkit on Amazon EC2

Started 3 years ago by mrflip / 6 posts

  1. Hi,
    We're assembling MachetEC2, a free Amazon EC2 machine image that comes out of the box with a full suite of data analysis and visualization tools. (Obviously the compute time is on your dime, but the base image is there for you to grab, for free, and get to hackin'.)

    What tools do you want to have in there? We're going to push this up in the next few days -- look for an announcement on the infochimps blog, and we'll reply here too. If you'd like to help, set up something you find essential and tweet/email us to 'pull' from your instance. (You'll have to use a credit card to get an instance, but it's so cheap it'll only cost a buck or two; and once it's set up we'll pull it onto our account where everyone can get it.) If there's already something like this floating around, please let us know and we'll work with them instead.

    We'll initially post up some subset of:

    * Ruby, Python, Erlang, R
    * MySQL, PostgreSQL
    * AllegroGraph, CouchDB
    * Hadoop, Hive, Pig
    * Cytoscape, Gruff
    * Processing, Prefuse/Flare, Modest Maps
    * NLTK, SciPy
    ...
    * YOUR SUGGESTION HERE

    This will fit in with the things we're helping to add to the AWS Public Data Sets. We want you to be able to use infochimps to find something or pull down the Public Data Set of your choice, load that dataset next to your MachetEC2 instance clone, and be wailing on data from the word go.

    At some later point we can specialize (MachetEC2-viz, MachetEC2-ML for machine learning folks, MachetEC2-ling for linguistics, etc.), so feel free to suggest things of (somewhat) narrow focus, and, as I said, feel even more free to set it up and ping us. Free/Open software only, of course.

  2. Wow, 15 minutes later and already some great additions so far from twitter.

    Including:

    * @hackingdata: prefuse/flare

    * @peteskomoroch: @infochimps I did something similar a while back, might want to add ipython, boto. some libs I considered: http://tinyurl.com/ckl2zr (<-- whoa. Awesome.)

    * @dwf: Mayavi2 and its dependency, VTK: Great for 3D viz stuff. PyMC, definitely. matplotlib. Maybe Boost?

    * @grantmichaels: erlang

    * @ogrisel: happy or disco for mapreduce, numpy, scipy, pylab, boto, mdp, lxml, liblinear, cython, gcc, torch5

    Already more than our finite monkeys can roll out, but with this kind of interest we'll get the ball rolling (or get behind @peteskomoroch's ball and help push) and watch what happens from there.

  3. So in layman's terms... MachetEC2 would be used to make data storage and retrieval easier with Amazon EC2?

  4. Amazon's EC2 allows users to instantiate a virtual computer with a pre-installed operating system, software packages, and up to 1 TB of data pre-loaded on disk, ready to work with, from a pre-defined image (an "Amazon Machine Image", or AMI).

    MachetEC2 is an effort by a group of Infochimps to create an AMI for data processing, analysis, and visualization. If you create an instance of MachetEC2, you'll have an environment with tools designed for working with data ready to go. All you'll need to do is load in some data and get to work!

  5. Ah, I get it now

  6. MachetEC2 is released! Check out our blog post (soon to be wiki) for more details. The id of the image is ami-29ef0840. Launch it and try it out!
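
For anyone who would rather script the launch than click through the console, here is a rough Python sketch of what "launch it" could look like. Only the image id comes from this thread. The instance type, region, and key pair name are placeholder assumptions, and the use of boto3 (the current AWS SDK for Python, not the older boto mentioned above) is my own choice. The image may no longer be available, so treat this as an illustration rather than a recipe.

```python
# Sketch only: builds the arguments for an EC2 launch request.
# The AMI id comes from the thread; everything else is an assumed placeholder.

MACHETEC2_AMI_ID = "ami-29ef0840"  # image id quoted in the release post


def build_launch_request(image_id=MACHETEC2_AMI_ID,
                         instance_type="m1.small",  # assumed size, not from the thread
                         key_name=None,             # hypothetical key pair name
                         count=1):
    """Return keyword arguments suitable for boto3's EC2 client run_instances()."""
    if count < 1:
        raise ValueError("count must be at least 1")
    request = {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
    }
    if key_name is not None:
        request["KeyName"] = key_name
    return request


def launch(request, region_name="us-east-1"):  # region is an assumption
    """Send the request to EC2 (needs boto3 and AWS credentials, and costs money)."""
    import boto3  # imported lazily so the request builder works without it
    ec2 = boto3.client("ec2", region_name=region_name)
    response = ec2.run_instances(**request)
    return [inst["InstanceId"] for inst in response["Instances"]]


if __name__ == "__main__":
    # Only prints the request; call launch(...) yourself once you are ready to pay for it.
    print(build_launch_request(key_name="my-keypair"))
```

Building the request separately from sending it lets you inspect exactly what would be launched before any compute time starts billing.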

