Beauty salons and plastic surgery clinics are now entering into the mobile app scene. And we take tremendous satisfaction in watching a client’s business grow and in knowing that we’re a part of their success.
In this video case-study, we’d like to show you an app that allows image editing almost on the fly. The graphics library we developed is a core part of the app – it uses mesh-warping algorithm (Bezier curve for warping the images and various effects) and bilinear interpolation (for filling in missing pixels).
Rotate and shrink photos, correct image distortion, clone areas, edit in portrait and landscape modes, zooming – everything in real-time. The client thinks it’s awesome and the App Store ratings too.
If you’re familiar with Google’s ‘Project Glass’, The Layar app and Sherlock (TV series), you already know what Augmented Reality is and how cool it looks, especially on mobile devices. Being a part of a parent concept called Mediated Reality, Augmented Reality is a view of a physical, real-world environment whose elements are augmented by a computer-generated sensory input – graphics, video sound, or GPS data.”
In this video, we’d like to showcase three core technologies we used while developing the iOS app for a client:
OpenCV – aka Open Source Computer Vision Library, a neat library of programming functions mainly designed for real-time computer vision and image processing.
ARToolKit – a software library specially for building Augmented Reality apps. ARToolKit allowed us to solve one known problem of AR development related to tracking the users’ viewpoint.
OpenGL – Open Graphics Library was developed in 1992 and it’s now is a standard specification defining a cross-language, multi-platform API for creating applications and simulating physics that create 2D and 3D computer graphics.
According to Urban Dictionary, it’s like a WOW feeling when you’ve experienced this augmented reality. And if you want to wow your users on a global scale, our mobile development team knows how to do it.
My recent blog (Big Data & NoSQL Technologies) discussed various NoSQL technologies and market vendors. Today let’s dive into column-oriented databases and why they should play an important role in any data warehouse whose focus is on aggregations or metrics (and whose isn’t?).
So you are all probably familiar with row-oriented databases. Tables of data where rows of fields (also called columns) represent the structural storage and the corresponding SQL queries that select, insert, update, and delete that data. Most database vendors like Oracle, Microsoft, Sybase, Informix, and many others all base their technology on this ANSI standard. Column-oriented databases are indeed what you might surmise; tables of data where columns of data values represent the structural storage. What you might not expect is that on the surface many column-oriented databases look and feel like row oriented databases also using SQL queries in much the same way. Creating tables, storing data, querying them are all pretty much identical. They may appear similar, but two principal things to understand is that the significant differences under the hood, in particular, physical storage and query optimization.
As noted in my previous blogs on NoSQL, there is also a column-store technology out there. Let’s not confuse that with column oriented databases. They are different. Since several NoSQL column-store vendors were highlighted before, we will focus instead on the column oriented database vendors here.
First, some key benefits to column oriented databases:
High performance on aggregation queries (like COUNT, SUM, AVG, MIN, MAX)
Highly efficient data compression and/or partitioning
True scalability and fast data loading for Big Data
Accessible by many 3rd party BI analytic tools
Fairly simple systems administration
Due to their aggregation capabilities which compute large numbers of similar data items, column oriented databases offer key advantages for certain types of systems, including:
Data Warehouses and Business Intelligence
Customer Relationship Management (CRM)
Library Card Catalogs
Ad hoc query systems
Column oriented database technology has actually been around for many years originating in 1969 with an application called TAXIR which provided abstracts for mathematical biosciences. In 1976, Canada implemented the RAPID system for processing and retrieval of population and housing census statistics. Sybase IQ was the only commercially available column-oriented database for many years, yet that has changed rapidly in the last few years. Let’s take a quick look at some of today’s key players:
A highly optimized analytics server designed specifically to deliver superior performance for mission-critical business intelligence, analytics and data warehousing solutions on any standard hardware and operating system. Its column oriented grid-based architecture, patented data compression, and advanced query optimizer delivers high performance, flexibility, and economy in challenging reporting and analytics environments.
Essentially a data partitioned, index based storage technology, Sybase IQ’s engine offers several key features which include:
Web enabled analytics
Communications & Security
Fast Data Loading
Query Engine supporting Full Text Search
Column Indexing Sub System
Column Storage Processor
User Friendly CUI based Administration & Monitoring
Multiplex Grid Architecture
Information Live-cycle management
The Sybase IQ Very Large Data Base (VLDB) option provides partitioning and placement where a table can have a specified column partition key with value ranges. This partition allows data that should be grouped together to be grouped together and separates data where they should be separated. The drawback to this methodology is that it is not always known which is which.
Offering both a commercial (IEE) and a free community (ICE) edition, the combination of a column oriented database with their Knowledge Grid architecture delivers a self-managed, scalable, high performance analytics query platform. Allowing 50Tb using a single server, their industry-leading data compression (10:1 up to 40:1) significantly reduces storage requirements and expensive hardware infrastructures. Delivered as a MySQL engine, Infobright runs on multiple operating systems and processors needing only a minimum of 4Gb of RAM (however 16Gb is a recommended starting point).
Avoiding partition schemes, Infobright data is stored in data packs, each node containing pre-aggregated statistics about the data stored within them. The Knowledge Grid above provides related metadata providing a high level view of the entire content of the database. Indexes, projections, partitioning or aggregated tables are not needed as these metadata statistics are managed automatically. The granular computing engine processes queries using the Knowledge Grid information to optimize query processing eliminating or significantly reducing the amount of data required for decompressing and access to answer a query. Some queries may not need to access the data at all, finding instead the answer in the Knowledge Grid itself.
The Infobright Data Loader is highly efficient so data inserts are very fast. This performance gain does come at a price so avoid updates unless absolutely necessary, design de-normalized tables, and don’t plan on any deletes. New features to the data loader include a reject option which allows valid rows to commit while invalid rows are logged. This is highly useful when loading millions of rows and only having a few rows with bad data. Without this feature the entire data load would be rolled back.
Recently acquired by Hewlett Packard, this platform was purpose built from the ground up to enable data values having high performance real-time analytics needs. With extensive data loading, queries, columnar storage, MPP architecture, and data compression features, diverse communities can develop and scale with a seamless integration ecosystem.
Claiming elasticity, scale, performance, and simplicity the Vertica analytics platform uses transformation partitioning to specify which rows belong together and parallelism for speed. Several key features include:
Columnar Storage & Execution
Real-Time Query & Loading
Scale-out MPP Architecture
Automatic High Availability
Aggressive Data Compression
Extensible In-Database Analytics Framework
In-Database Analytics Library
Database Designer & Administration Tools
Native BI & ETL support for MapReduce & Hadoop
The Vertica Optimizer is the brains of the analytics platform producing optimal query execution plans where several choices exist. It does this through traditional considerations like disk I/O and further incorporates CPU, memory, network, concurrency, parallelism factors and the unique details of the columnar operator and runtime environment.
Analytic-driven companies need a platform, not just a database where speed, agility, and complexity drive the data ecosystem. The ParAccel Analytic Platform streamlines the delivery of complex business decisions through its high performance analytic database. Designed for speed, its extensible framework supports on-demand integration and embedded functions.
The ParAccel Database (PADB) present four main components: the ‘Leader’ node, the ‘Compute’ node, the Parallel Communications Fabric, and an optional Storage Area Network (SAN). The ‘Leader’ controls the execution of the ‘Compute’ nodes and all nodes communicate with each other via the ‘Fabric’ running on standard x86 Linux servers. Each ‘Compute’ node is subdivided into a set of parallel processes called ‘slices’ that include a CPU core, and thier allocation of memory, and local disk storage. The ‘Communication Fabric’ provides a low-level MPP network protocol for increased performance.
Key PADB features include:
High Performance & Scalability
ParAccel Integrated Analytics Library and Extensibility Framework incorporates advanced functions along with an API to add your own functions to help address complex business problems right in the core database enabling customers to focus upon their specific data complexities.
Released this year, Microsoft has now embraced the columnar database idea. The latest SQL Server release 2012 includes xVelocity, a column-store index feature that stores data similar to a column-oriented DBMS. While not a true column oriented database, this technique allows for the creation of a memory optimized index that groups and stores data for each column then and joins them together to complete the index. For certain types of queries, like aggregations, the query processor can take advantage of the column-store index to significantly improve execution times. Column store indexes can be used with partitioned tables providing a new way to think about how to design and process large datasets.
The column-store index can be very useful on large fact tables in a Star schema improving overall performance, however the cost model approach utilized may choose the column-store index for a table when a row based index would have been better. Using the IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX query hint will work around this if it occurs. When data is stored with a column-store index, data can often be compressed more effectively over a row based index. This is accomplished as typically there is more redundancy within a column than within a row. Higher compression means less IO is required to retrieve data into memory which can significantly reduce response times.
There are several restrictions and limitation in using a column-store index. For example, which data types are supported or not and that you can only create one column-store index on any table can be problematic. Become familiar with what it can do and where best to use it. Currently the column-store index is not supported on Microsoft Azure.
Column-oriented databases provide significant advantages over traditional row oriented system applied correctly; In particular for data warehouse and business intelligence environments where aggregations prevail. It would not be fair however to ignore the disadvantages. Let’s look at these two:
Efficient storage and data compression
Fast data loads
Fast aggregation queries
Simplified administration & configuration
Transactions are to be avoided or just not supported
Queries with table joins can reduce high performance
Record updates and deletes reduce storage efficiency
Effective partitioning/indexing schemes can be difficult to design
The real value in using column-oriented database technology comes from high performance, scalable storage and retrieval of large to massive datasets (Big Data) focused on aggregation queries. Simply put: Reports! You can design Star schema’s or Data Vaults (The Data Vault – What is it? – Why do we need it?) incorporating these technologies and you will find that column-oriented databases provide a clear solution in data warehouse and business intelligence.
Look for future blogs on Hadoop/Hive/HBase and Extract-Transform-Load (ETL) technologies, and don’t be afraid to comment, question, or debate, there is always room to learn new things…
Each year the Microsoft Worldwide Partner Conference (digitalwpc.com) - the company’s premier annual event for industry peers and partners - offers technology partners the opportunity to vie for awards in three different categories (Sales, Technology, and V-TSP achievements).
For a 3rd year in a row (we hear the sound of applause), Microsoft recognized our efforts in helping customers modernize their applications and infrastructure. Now in Toronto, Canada (July 8-12), DB Best Technologies (DBbest.com) received the Microsoft Platform Modernization Award for V-TSP Achievement, with special recognition for:
Database migration focus
Field sales and technology excellence
Consistently successful account engagements
Consistently high customer satisfaction
And the winner is…
“There was a lot of excitement this year around Microsoft products (SQL 2012, Azure, Windows 8, Office 15, Surface, Windows Phone),” said Dmitry Balin (@DBalin), CEO and co-founder of DB Best. Also “Hybrid IT” is a new and interesting concept that was discussed at the conference (#WPC12). With regards to data management, the idea is to keep data in the hybrid mode: on premises and in the cloud (SQL/Windows Azure).”
What Exactly We Did
Bringing incredible value to the companies.
DB Best strives to help customers better manage data, and get more value from the Microsoft data platform. Demonstrating a unique approach and exceptional dedication while working with the Microsoft account teams on successful deployments of the SQL Server 2012 and Azure-based applications, we deliver innovative solutions that address the technology and business challenges of its customers.
We’re looking forward to winning it again next year in Houston!