By Camuel Gilyadov, on March 1st, 2012
This is a post by Constantine Peresypkin and David Gruzman. Lately we have been working on integrating Hadoop with OpenStack Swift. Hadoop doesn’t need an introduction, and neither does OpenStack. Swift is an object-storage system and the technology behind RackSpace cloud-files (and quite a few others, like Korea Telecom object storage, Internap, etc.). Before we go . . . → Read More: Apache Hadoop over OpenStack Swift
By Camuel Gilyadov, on February 11th, 2012
I’ve been pitched by a lot of entrepreneurs trying to make better-than-original “tooling” for a proprietary cloud, particularly for AWS. Isn’t the attempt futile from the beginning? Amazon is smart, innovative and working hard to make its cloud offering comprehensive, and it has a much larger arsenal to outdo anyone who dares to compete on its own turf. . . . → Read More: Futility of “tooling” a proprietary cloud.
By Camuel Gilyadov, on February 7th, 2012
I haven’t blogged for the whole of 2011… I’m not dead; quite the contrary, we were pretty active with the OpenDremel project in 2011. First, we are renaming it to Dazo to avoid using a trademarked name, and second, we did a good job implementing a secure generic execution engine and integrating it into OpenStack Swift. It also . . . → Read More: OpenDremel update and Dremel vs. Tenzing
By Camuel Gilyadov, on January 17th, 2011
Some examples of the upcoming hardware renaissance era:
1. Virtually all server vendors are pitching modularized data centers (MDCs) by now. MDCs are boxes resembling shipping containers that accommodate a complete virtualized data center inside. With an MDC one just connects power, network and chilled water and gets access to the cloud in a box. Most MDCs are good to . . . → Read More: Upcoming hardware renaissance era: part #2.
By Camuel Gilyadov, on November 17th, 2010
INTRO I cannot count the number of times I have heard that cloud computing means innovation stagnation in the proprietary hardware business, and that with cloud computing hardware doesn’t matter anymore and will sooner or later turn into a boring, razor-thin-margin, oligopolistic commodity industry.
GAME OVER FOR FAT MARGINS IN PROPRIETARY HARDWARE? Why do folks think like that? . . . → Read More: Emerging Proprietary Hardware Renaissance
By Camuel Gilyadov, on October 17th, 2010
It seems the recent craze about statistician being the profession of choice in the future is gaining steam: a future where we will be surrounded by quality BigData, capable computers and bug-free open source software, including OpenDremel. Well, the last one I made up… but the rest seems to be the current situation. Acknowledging this . . . → Read More: Two Envelopes Problem: Am I just dumb?
By Camuel Gilyadov, on October 13th, 2010
1. SSD is NOT synonymous with flash memory.
First of all, let’s settle on terms. SSD is best described as the concept of using semiconductor memory as disk. There are two common cases: DRAM-as-disk and flash-as-disk. Flash memory is a semiconductor technology pretty similar to DRAM, just with a slightly different set of trade-offs made.
. . . → Read More: Debunking common misconceptions in SSD, particularly for analytics
By Camuel Gilyadov, on October 12th, 2010
Here are my early thoughts after quickly looking into Google Percolator and skimming the paper.
Major take-away: massive transactional mutation of a tens-of-petabytes-scale dataset on a thousands-node cluster is possible!
MapReduce is still useful for distributed sorts of big data and a few other things; nevertheless, its “karma” has suffered a blow. Beforehand you could end any MapReduce dispute by . . . → Read More: Google Percolator: MapReduce Demise?
By Camuel Gilyadov, on October 11th, 2010
According to this excellent and comprehensive research, with some kernel hacking a ~x33 speedup (compared to a single core) is possible. For example, PostgreSQL running on 48 cores gives ~x4 out of the box, and after kernel/PostgreSQL patches are applied it grows to ~x33. Assuming IO can keep up, of course.
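To put the 48-core PostgreSQL figures in perspective, here is a minimal sketch that turns a measured speedup into parallel efficiency (speedup divided by core count) and into the serial fraction that Amdahl's law would imply. Amdahl's law is my own framing here, not something the research is quoted as using, and the function names are hypothetical.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / cores


def amdahl_serial_fraction(speedup: float, cores: int) -> float:
    """Serial fraction s implied by Amdahl's law, S = 1 / (s + (1 - s) / N).

    Solving for s gives s = (1/S - 1/N) / (1 - 1/N).
    Assumption: the workload fits Amdahl's simple model at all.
    """
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    cores = 48
    # Approximate speedups quoted in the post: ~x4 stock, ~x33 patched.
    for label, speedup in (("out of the box", 4.0), ("patched", 33.0)):
        eff = parallel_efficiency(speedup, cores)
        serial = amdahl_serial_fraction(speedup, cores)
        print(f"{label}: efficiency {eff:.1%}, implied serial fraction {serial:.2%}")
```

Under these assumptions the stock setup uses roughly 8% of the machine's ideal capacity and the patched one roughly 69%, which is why the patches matter far more than the raw core count.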
By Camuel Gilyadov, on October 11th, 2010
Yes, it is.
Proof? – By definition.
But Wikipedia…… – fixed.
By Camuel Gilyadov, on October 8th, 2010
The CAP theorem deals with trade-offs in transactional systems. It doesn’t need an introduction, unless of course you have been busy on the moon for the last couple of years. In that case you can easily Google for good intros. Here is a Wikipedia entry on the subject.
I was thinking about how I would build an . . . → Read More: CAP equivalent for analytics?
By Camuel Gilyadov, on October 8th, 2010
Unsatisfied with my previous post’s definition of Advanced Analytics, and after giving some thought to what advanced methods in analytics are, I realized that the analytics industry lacks a good analytics pattern catalog: a list of common problems followed by a list of common industry-consensus solutions to them. An equivalent of the GoF design patterns for analytics. The . . . → Read More: Analytics Patterns
By Camuel Gilyadov, on October 8th, 2010
Volume Scalability => the solution must handle high volumes of data, meaning the cost must scale linearly in the range of 10GB – 10PB.
Latency Scalability => the solution must be interactive or batch, and cost must scale linearly in the range of 1 msec – 1 week.
Sophistication Scalability => the solution . . . → Read More: Feature list of ultimate BigData analytics
By Camuel Gilyadov, on October 7th, 2010
I see a lot of confusion in the usage of newer terms in analytics. I do confuse them myself occasionally. I find it funny that an industry as serious as analytics tolerates constant renewal of its basic terminology. Yet, I confess, I’m very guilty of it myself. I do enjoy the freshness and the novelty . . . → Read More: Terminology: Analysis vs. analytics and more…
By Camuel Gilyadov, on October 1st, 2010