Broadgate Big Data Dictionary Part Two

Posted on : 31-08-2012 | By : richard.gale | In : Cloud, Data



Last month we started our Big Data Dictionary to help define terms and concepts in the world of data management. Many of our readers have said it has been useful so we are continuing it this month and will focus more on some of the products and technologies that are emerging in this space.

Over the next few months we have some guest contributors penning their thoughts on the future of big data, analytics and data science. Also, don’t miss Tim Seears’s (TheBigDataPartnership) article on maximising value from your data, “Feedback Loops”, published here in June.

So continuing the theme from last month:


Hive

Hive provides a simple, high-level, SQL-like language for processing and accessing data stored in Hadoop files. Hive can provide analytical and business intelligence capability on top of Hadoop. Hive queries are translated into a set of MapReduce jobs that run against the data. The technology is used by many large technology firms, including Facebook and Last.fm. The latency and batch-related limitations of MapReduce are present in Hive too, but the language allows non-Java programmers to access and manipulate large data sets in Hadoop.
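To make the translation step more concrete, here is a rough Python sketch of how a SQL-style aggregate can be expressed as a map step followed by a reduce step. The query, the `sales` rows and the function names are all our own illustrative assumptions; this is not Hive's actual query planner, only the general idea behind it.

```python
from collections import defaultdict

# Hypothetical query, written in the SQL-like style that Hive accepts:
#   SELECT country, SUM(amount) FROM sales GROUP BY country;

# Made-up sample rows standing in for a table stored in Hadoop files.
sales = [
    {"country": "UK", "amount": 10},
    {"country": "US", "amount": 5},
    {"country": "UK", "amount": 7},
]


def map_rows(rows):
    """Map step: emit one (group key, value) pair per input row."""
    for row in rows:
        yield row["country"], row["amount"]


def reduce_groups(pairs):
    """Reduce step: sum the values that share the same key."""
    totals = defaultdict(int)
    for key, amount in pairs:
        totals[key] += amount
    return dict(totals)


result = reduce_groups(map_rows(sales))
print(result)
```

The person writing the query only has to think about the `SELECT ... GROUP BY` line; the map and reduce steps are generated on their behalf.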

Machine Learning

Machine learning is one of the most exciting concepts in the world of data. The idea is not new, but the focus on feedback loops of information, and on algorithms that act and adapt according to the data without manual intervention, could improve numerous business functions. The aim is to find new or previously unknown patterns and linkages between data items to obtain additional value and insight. An example of machine learning in action is Netflix, which is constantly trying to improve its movie recommendation system based on a user’s previous viewing, their characteristics and the features of other customers with a similar set of attributes.
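As a very small illustration of the "similar customers" idea, the Python sketch below scores titles a user has not yet seen by how similar the other viewers of those titles are to that user. The viewing histories, the function names and the choice of Jaccard similarity are our own assumptions for the example; we are not describing how Netflix actually does it.

```python
def jaccard(a, b):
    """Similarity of two sets: size of the overlap divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, histories, top_n=2):
    """Rank titles the target has not seen by the similarity of users who watched them."""
    seen = histories[target]
    scores = {}
    for user, titles in histories.items():
        if user == target:
            continue
        weight = jaccard(seen, titles)
        for title in titles - seen:
            scores[title] = scores.get(title, 0.0) + weight
    # Highest score first; ties broken alphabetically so the result is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, score in ranked[:top_n] if score > 0]


# Made-up viewing histories for three hypothetical users.
histories = {
    "ann": {"Alien", "Blade Runner", "Solaris"},
    "bob": {"Alien", "Blade Runner", "Moon"},
    "cat": {"Amelie", "Moon", "Notting Hill"},
}

print(recommend("ann", histories))
```

The feedback loop comes from re-running this as new viewing data arrives: each new title watched changes the similarities, and therefore the recommendations, without anyone editing the rules by hand.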


MapReduce

MapReduce is a framework for processing large amounts of data across a large number of nodes or machines.

Map Reduce diagram (courtesy of Google)

MapReduce works by splitting (or mapping) a request into multiple separate tasks to be performed on many nodes of the system, and then collating and summarising (or reducing) the results into the final output.

Hadoop’s MapReduce implementation is written in Java and is the basis of a number of the higher-level tools (Hive, Pig) used to access and manipulate large data sets.

Google developed this technology and uses it (as do others) to process large amounts of data (such as documents and web pages gathered by its web crawlers). It allows the complexity of parallel processing, data location and distribution, and system failures to be hidden or abstracted from the requester running the query.
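The classic illustration is a word count. The Python sketch below runs on a single machine and uses our own function names, so it shows only the shape of the map, group and reduce phases, not the distribution across nodes that a real framework handles.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a single total."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["big data", "big cluster"]))
```

In a real cluster each document (or block of a file) would be mapped on a different node, and the grouped keys would be spread across several reducers.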


MPP

MPP stands for massively parallel processing, the concept that makes it possible to process the volumes (and velocity and variety) of data flowing through systems. Chip processing capabilities are always increasing, but to cope with the even faster growth in data, processing needs to be split across multiple engines. Technology that can split requests into equal(ish) chunks of work, manage the processing and then join the results has been difficult to develop. MPP can be centralised, with chips or machines in a single or closely coupled cluster, or distributed, where the power of many separate machines is used (think of ‘idle’ desktop PCs being used overnight as an example). Hadoop utilises many distributed systems for data storage and processing and also has fault tolerance built in, which enables processing to continue despite the loss of some of those machines.
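The split, process and join pattern can be sketched in a few lines of Python. This is a toy under stated assumptions: the function names are ours, the "engines" are threads on one machine rather than separate chips or servers, and there is none of the scheduling or fault tolerance that makes the real thing hard.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous pieces whose sizes differ by at most one."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum(items, n_workers=4):
    """Sum each chunk on its own worker, then join the partial results."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)


print(split_into_chunks(list(range(10)), 4))
print(parallel_sum(list(range(10))))
```

Ten items across four workers gives chunks of three, three, two and two items, which is the "equal(ish)" split described above.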


NoSQL

NoSQL is usually taken to mean ‘not only SQL’. It is the term used for database management systems that do not conform to the traditional RDBMS model (transaction-oriented data management systems based on the ACID principles). These systems were developed by technology companies in response to the challenges raised by high volumes of data. Amazon, Google and Yahoo built NoSQL systems to cope with the tidal wave of data generated by their users.


Pig

Apache Pig is a platform for analysing huge data sets. It has a high-level language called Pig Latin, combined with a data management infrastructure that allows high levels of parallel processing. Like Hive queries, Pig Latin is compiled into MapReduce jobs. Pig is also flexible, so users can add functions and processing for their own specific needs.

Real Time

The challenges in processing the “Vs” of big data (volume, velocity and variety) have meant that some requirements have been compromised. In the case of Hadoop and MapReduce this has been the interactive or instant availability of results. MapReduce is batch-orientated in the sense that requests are sent for processing, scheduled to be run, and only then is the output summarised. This works fine for the original purposes, but the demand for more real-time or interactive processing is growing. With a ‘traditional’ database or application, users expect the results to be available instantly or pretty close to it. Google and others are developing more interactive ways of querying big data. Google has Dremel (which inspired the open-source Apache Drill project) and Twitter has released Storm. We see this as one of the most interesting areas of development in the Big Data space at the moment.
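The difference between the two styles can be shown with a small Python sketch. The class, function and event names are our own, and the sketch is not modelled on the API of Storm or any other product: it simply contrasts a count that is only available once the whole input has been processed with one that is kept up to date as each event arrives.

```python
def batch_count(events):
    """Batch style: no result is available until the whole input has been processed."""
    counts = {}
    for key in events:
        counts[key] = counts.get(key, 0) + 1
    return counts


class RunningCount:
    """Streaming style: per-key counts are updated as each event arrives."""

    def __init__(self):
        self.counts = {}

    def update(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


# Made-up event stream for illustration.
events = ["login", "search", "login"]

stream = RunningCount()
seen_so_far = [stream.update(event) for event in events]

print(seen_so_far)
print(stream.counts == batch_count(events))
```

Both approaches end with the same totals; the streaming version can answer "how many so far?" at any point along the way.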

These are our thoughts on the products and technologies – we would welcome any challenges or corrections and will work them into the articles.



No-win/No-fee: Cost Takeout Services

Posted on : 31-08-2012 | By : jo.rose | In : Finance



Enterprises are always under pressure to deliver IT value and cost reduction, even more so in today’s climate. Recognising that resources are scarce and that there is no budget to “spend money to save money”, we have developed a model of cost takeout and recovery services that works on a risk and reward basis (no win, no fee).

Our partners are currently delivering multi-millions in recovery or avoidance in the four key areas of Fixed Line Telecom, Mobile Tariffs, Software Licensing and IT Assets and Infrastructure.

Some characteristics of the offering:

  • The service is a “light touch” approach. We do not require significant client resources, only access and authority to the artefacts and contracts which underpin the associated focus area.
  • We have developed this service through years of experience in driving efficiencies, and it is underpinned by expert staff in financial and contractual management.
  • We agree the success criteria and shared reward targets upfront with our clients.
  • In some areas, such as mobile phone tariffs, costs can be reclaimed retrospectively for up to six years.
  • The only risk to the client is doing nothing!


If this is something that may be of interest please contact Jo Rose to discuss how our approach could help your organisation.





What does the future hold for Retail Banking?

Posted on : 30-08-2012 | By : jo.rose | In : Finance



A recent survey found that a majority of consumers would prefer to have their bank accounts run by the likes of John Lewis, Waitrose or Amazon. This is not surprising given that every time you open the paper there is another example of the industry under fire from all corners, be it regulators, politicians, the media and, in some cases, the banks themselves.

Indeed, taking a look at YouGov’s BrandIndex for NatWest and Barclays during the IT systems failure and the Libor scandal, they plummeted to -63 and -59 on the daily Buzz and -22 and -24 on Brand Perception respectively (the index comprises six key image measures). That is not far short of BP during the 2010 oil spill.

In the survey over 75% stated they would switch to a more reputable brand, with less than a third actually satisfied with their bank’s customer service. The industry is under fire and its future is difficult to predict in terms of shape, size and competition. What is certain is that in order to maintain a profitable and sustainable retail banking model things need to change.

So, what are the issues and trends facing the industry?

  • Regulatory and Political: increased regulation from the likes of Basel III, RDR, FATCA and MiFID II, implications of the Leveson inquiry, capital and liquidity requirements
  • Economic Pressure and Competition: impacts of the Eurozone crisis, a shift in sovereign wealth distribution, lower margins, increased competition from incumbents and new entrants to the market
  • Products and Channels: how to meet customer expectations and needs in terms of areas such as mobile payments, increased transparency for customer products and more innovation
  • Operating Model: increased pressure on efficiency for operations, agility in product launch and technology commoditisation/consumerisation

All of these areas, combined with the decrease in trust and a requirement to “ease” customer switching processes between banks, are really putting a squeeze on the industry.

On the competition side, there are real threats coming to the traditional high street banks. We have already seen new entrants in the form of Metro Bank which, as well as challenging the customer service experience, has benefited from not having the legacy and complexity of older IT systems, so can provide a more integrated experience to the customer.

Other players such as Tesco and Virgin Money are imminently entering the current account and mortgage markets, both benefiting from brand affinity and cross selling opportunities through customer analytics.

Alongside this there is a rise in peer-to-peer lenders and investment vehicles (such as Zopa and ThinCats) who have capitalised on the market situation, low cost base and limited regulation to provide competitive services.

In terms of innovation in products and channels, there have been some recent examples of developments in this area:

  • RBS: launched “Get Cash” technology in 2012 that allows customers to withdraw up to £100 in cash without a card, using a secure passcode
  • Barclays: launched the “Pingit” mobile payments service, allowing peer-to-peer transactions for UK customers via mobile phones
  • Banco Bradesco: opened a “bank of the future” with state-of-the-art technology including biometric ATMs, smart walls, avatars and robot greeters

However, alongside this innovation, we expect the whole non-bank payments industry to provide significant challenges and competition. Companies like PayPal have a strong foothold and new entrants such as Monitise are rapidly increasing their share of the mobile payments space, both organically and through acquisition.

Then consider this against the likelihood of “Generation Z” (usually categorised as those born after 1995) relying entirely on their smartphones for payments, i.e. having no requirement for traditional bank accounts but a strong desire (indeed expectation) for services to be provided through existing and emerging peer and social media technology channels.

So it’s going to be tough for a few years…and only those banks that react to the challenges and evolve through this period will survive.