Erasure Coding in HDFS

The folks at edureka! have an excellent post on some of the new features in Hadoop 3.
One of the features that really caught our eye is Erasure Coding in HDFS. It brings a RAID-type architecture to HDFS and can save a ton of storage. It’s like going from RAID 1 full mirroring to RAID 6. Our main concerns are stability, performance and rebuild times.
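To put a rough number on the storage savings, here is a minimal sketch comparing the raw capacity needed under HDFS's traditional 3x replication with an erasure-coded layout. It assumes a Reed-Solomon layout of 6 data and 3 parity units, which is our assumption for illustration and not a figure from the posts linked here; the function name and the 100TB data size are hypothetical.

```python
def raw_storage_tb(logical_tb: float, data_units: int, parity_units: int) -> float:
    """Raw capacity needed to hold logical_tb of data.

    Every group of data_units blocks is stored alongside parity_units
    redundant blocks (extra copies for replication, parity for erasure coding).
    """
    return logical_tb * (data_units + parity_units) / data_units


logical_tb = 100.0  # hypothetical amount of user data

# 3x replication: each block plus 2 extra full copies.
replicated = raw_storage_tb(logical_tb, data_units=1, parity_units=2)

# Assumed Reed-Solomon layout: 6 data blocks + 3 parity blocks.
erasure_coded = raw_storage_tb(logical_tb, data_units=6, parity_units=3)

savings = 1 - erasure_coded / replicated

print(f"3x replication: {replicated:.0f} TB raw")
print(f"RS(6,3):        {erasure_coded:.0f} TB raw")
print(f"Savings:        {savings:.0%}")
```

Under these assumptions the same data needs half the raw disk, while still tolerating the loss of multiple blocks. The trade-off is the extra CPU and network work needed to rebuild lost blocks, which is where the rebuild-time concern comes from.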

Here are some limitations so far in using this new technology from the Hortonworks documentation site:

If this new feature goes full GA and is stable, it could save our customers a lot of money in storage costs!

Check out the blog post here.

Here is a great in-depth blog post about Erasure Coding from Cloudera, published back in 2015.

How AWS paid for our trip to Hawaii

There is a great article over on LinkedIn about a very promising-looking product to “bring back your cloud” into your own data center or colocation facility. Folks are getting sticker shock from their AWS bills, and although some individuals are benefiting from it (trips to Hawaii using credit card points...), most are trying to figure out how to reduce their spend and get to a more predictable hosting infrastructure.

We will definitely be checking out Cloudistics and their software. Maybe Bit Refinery can also reduce our “VMware tax” (love that...), as they call it.

Go check it out: How AWS paid for our trip to Hawaii
A customer considering moving some of their infrastructure to us sent this screenshot of their current AWS billing. Crazy stuff considering they were spending about 26k a month when they had their own data center space:

SSD Hadoop Nodes

Bit Refinery now offers SSD Big Data nodes. Our base server provides 128GB of RAM, dual 8-core Intel CPUs, 10Gb networking and six 1TB SSD drives, delivering a whopping 240,000 read IOPS for only $800/mo.

For the fun of it, let’s take a similar server at AWS. The r4.4xlarge will work, since it is near the specs that we provide. Adding six 1TB Provisioned SSDs with only 120,000 IOPS puts us at $10,281.54 a MONTH.

This means that one node over a year will cost you about $123,378 (12 x $10,281.54) vs. $9,600 with Bit Refinery. It’s an obvious choice... and that isn’t even taking into consideration that our SSDs are twice as fast on reads.
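For anyone who wants to check the math, here is a small sketch that annualizes the two monthly prices quoted above and compares the read IOPS. The variable names are ours; the dollar and IOPS figures are the ones given in this post. Expect small differences from any hand-rounded yearly figure.

```python
# Monthly prices and read IOPS as quoted in this post.
AWS_MONTHLY_USD = 10_281.54
BIT_REFINERY_MONTHLY_USD = 800.00
AWS_READ_IOPS = 120_000
BIT_REFINERY_READ_IOPS = 240_000
MONTHS_PER_YEAR = 12

aws_yearly = round(AWS_MONTHLY_USD * MONTHS_PER_YEAR, 2)
bit_refinery_yearly = round(BIT_REFINERY_MONTHLY_USD * MONTHS_PER_YEAR, 2)

cost_ratio = aws_yearly / bit_refinery_yearly
iops_ratio = BIT_REFINERY_READ_IOPS / AWS_READ_IOPS

print(f"AWS, one node, one year:          ${aws_yearly:,.2f}")
print(f"Bit Refinery, one node, one year: ${bit_refinery_yearly:,.2f}")
print(f"AWS costs about {cost_ratio:.1f}x as much")
print(f"Bit Refinery read IOPS advantage: {iops_ratio:.0f}x")
```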

Contact us for more information.

AWS Data Jail

Amazon released their Q3 2016 numbers the other day and their dominance is unreal. Never before have we seen a company dominate so many industries at the same time. With AWS revenue of $3.2 billion, or nearly 10 percent of Amazon’s total Q3 revenue, it’s no wonder companies are blindly flocking to “go to the cloud”.

AWS is great for prototyping and elastic workloads. The problem comes with servers that need to be up 24/7: the pricing and nickel-and-diming at the core of their business model really start to add up. Once you accrue enough data in S3, you are forced to interact only with services that are in the AWS ecosystem. That is great for AWS and their vendors. (AWS takes 20% of the revenue of every single app, much like Apple’s App Store.)

Before you head down the AWS route, break out a spreadsheet and try to estimate your monthly bill using their calculator. Don’t forget to add in S3 GETs, outgoing bandwidth, per-hour services, etc.
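If a spreadsheet isn’t handy, the same estimate can be sketched in a few lines of code. Everything below is hypothetical: the line items, quantities and unit prices are placeholders we made up to show the shape of the calculation, not real AWS rates. Plug in the numbers from the pricing calculator for your own workload.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


@dataclass
class LineItem:
    name: str
    quantity: float
    unit_price_usd: float  # PLACEHOLDER price, not a real AWS rate

    @property
    def cost(self) -> float:
        return self.quantity * self.unit_price_usd


def monthly_bill(items: list[LineItem]) -> float:
    """Sum every line item, rounded to cents."""
    return round(sum(item.cost for item in items), 2)


# Hypothetical workload: one always-on server plus S3 and bandwidth usage.
items = [
    LineItem("instance hours (24/7)", HOURS_PER_MONTH, 1.00),
    LineItem("S3 GET requests (per 1,000)", 50_000, 0.001),
    LineItem("outgoing bandwidth (GB)", 2_000, 0.10),
]

total = monthly_bill(items)

for item in items:
    print(f"{item.name:<30} ${item.cost:>10,.2f}")
print(f"{'estimated monthly total':<30} ${total:>10,.2f}")
```

The point of listing each item separately is that the small per-request and per-GB charges are easy to forget, and they are exactly the ones that grow with your data.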

We predict a 2-3 year swinging door. Companies will be forced by their alarming AWS bills to look to more predictable infrastructure solutions like Bit Refinery. Your data is yours and you should NEVER be trapped trying to get it out and use it somewhere else.

-Bit Refinery

Using Big Data for Good, Not Evil…

At a time when some people think Big Data is only used by large companies to mine customer buying habits and figure out how to market to them better, other folks are putting Big Data to work in very different ways.

A great example is this article from Ben Wellington. He took parking ticket data and joined it up with Google Street View to show that lots of these tickets were given out illegally: the person getting the ticket was actually parked LEGALLY.

Providing this level of data to anyone that wants to analyze it is a great thing. With the new tools to capture and analyze large data sets now available (mostly for free), we will continue to see articles like this in the future.

-Bit Refinery