Artificial Intelligence – Explaining the Unexplainable

Posted on : 23-09-2019 | By : kerry.housley | In : Finance, FinTech, General News, Innovation

The rise of Artificial Intelligence (AI) is dramatically changing the way businesses operate and provide their services. The acceleration of intelligent automation is enabling companies to operate more efficiently, promote growth, deliver greater customer satisfaction and drive up profits. But what exactly is AI? How does it reach its decisions? How can we be sure it follows all corporate, regulatory and ethical guidelines? Do we need more human control?

Is it time for AI to explain itself? 

The enhancement of human intelligence with AI's speed and precision means a gigantic leap forward for productivity. The ability to feed data into an algorithmic black box and return results in a fraction of the time a human could compute is no longer sci-fi fantasy but now a reality.

However, not everyone talks about AI with such enthusiasm. Critics are concerned that the adoption of AI machines will lead to the decline of the human role rather than freedom and enhancement for workers.

Ian McEwan, in his latest novel Machines Like Me, writes about a world where machines take over in the face of human decline. He questions machine learning, referring to it as

“the triumph of humanism or the angel of death?” 

Whatever your view, we are not staring at the angel of death just yet!  AI has the power to drive a future full of potential and amazing discovery. If we consider carefully all the aspects of AI and its effects, then we can attempt to create a world where AI works for us and not against us. 

Let us move away from the hype and consider in real terms the implications of the shift from humans to machines. What does this really mean? How far does the shift go?  

If we are to operate in a world where we rely on decisions made by software, we must understand how those decisions are calculated in order to have faith in the results.

In the beginning, AI algorithms were relatively simple, as humans learned how to define them. As time has moved on, algorithms have evolved and become more complex. Add machine learning to this and we have machines that can "learn" behaviour patterns, thereby altering the original algorithm. As humans don't have access to the algorithm's black box, we are no longer fully in charge of the process.

The danger is that we do not understand what is going on inside the black box and can therefore no longer be confident in the results it produces.

If we have no idea how the results are calculated, then we have lost trust in the process. Trust is the key element for any business, and indeed for society at large. There is a growing consensus around the need for AI to be more transparent. Companies need to have a greater understanding of their AI machines. Explainable AI is the idea that an AI algorithm should be able to explain how it reached its conclusion in a way that humans can understand. Often, we can determine the outcome but cannot explain how it got there!  
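One practical way to approach this is with a surrogate model: approximate the black box with a simple, readable model and inspect that instead. The sketch below is a minimal, hypothetical illustration using scikit-learn (the data and feature names are invented); it is not a description of any particular firm's system.

```python
# Minimal sketch: explaining a "black box" model with a simple surrogate.
# Hypothetical example using scikit-learn; data and feature names are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in data: e.g. loan applications described by four anonymous features.
X, y = make_classification(n_samples=2000, n_features=4, random_state=0)

# The "black box": accurate, but hard to interpret directly.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# The surrogate: a shallow tree trained to mimic the black box's decisions.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# A human-readable approximation of how decisions are being made.
print(export_text(surrogate, feature_names=[f"feature_{i}" for i in range(4)]))
```

The surrogate does not reproduce the black box exactly, but it gives reviewers and regulators a tractable description of the behaviour they are being asked to trust.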

Where that is the case, how can we trust the result to be true, and how can we trust it to be unbiased? The impact is not the same in every case; it depends on whether we are talking about low-impact or high-impact outcomes. For example, an algorithm that decides what time you should eat your breakfast is clearly not as critical as an algorithm which determines what medical treatment you should have.

The greater the shift from humans to machines, the greater the need for explainability.

Consensus for more explainable AI is one thing; achieving it is quite another. Governance is an imperative, but how can we expect regulators to dig deep into these algorithms to check that they comply, when the technologists themselves don't yet know how to do this?

One way forward could be a "by design" approach – i.e., think about the explainable element at the start of the process. It may not be possible to identify each and every step once machine learning is introduced, but a good business process map will help users define the process steps.

The US government has been concerned about this lack of transparency for some time and has introduced the Algorithmic Accountability Act 2019. The Act looks at automated decision making and will require companies to show how their systems have been designed and built. It only applies to large tech companies with turnover of more than $50m, but it provides a good example that all companies would be wise to follow.

Here in the UK, the Financial Conduct Authority is working very closely with the Alan Turing Institute to ascertain what the role of the regulator should be and how governance can be appropriately introduced.

The question is how explainable and how accurate the explanation needs to be in each case, depending on the risk and the impact.  

With AI moving to ever-increasing levels of complexity, it's crucial to understand how we get to the results in order to trust the outcome. Trust really is the basis of any AI operation. Everyone involved in the process needs to have confidence in the result and know that AI is making the right decision, avoiding manipulation and bias and respecting ethical practices. It is crucial that the AI operates within publicly acceptable boundaries.

Explainable AI is the way forward if we want to follow good practice guidelines, enable regulatory control and most importantly build up trust so that the customer always has confidence in the outcome.   

AI is not about delegating to robots, it is about helping people to achieve more precise outcomes more efficiently and more quickly.  

If we are to ensure that AI operates within boundaries that humans expect then we need human oversight at every step. 

What will the IT department look like in the future?

Posted on : 29-01-2019 | By : john.vincent | In : Cloud, Data, General News, Innovation

We are going through a significant change in how technology services are delivered as we stride further into the latest phase of the Digital Revolution. The internet provided the starting pistol for this phase and now access to new technology, data and services is accelerating at breakneck speed.

More recently, the real enablers of more agile and service-based technology have been virtualisation and orchestration technologies, which allow compute to be tapped into on demand and remove the friction between software and hardware.

The impact of this cannot be overstated. The removal of the need to manually configure and provision new compute environments was a huge step forward, and one which continues with developments in Infrastructure as Code ("IaC"), microservices and serverless technology.

However, whilst these technologies continually disrupt the market, the corresponding changes to the overall operating models have, in our view, lagged (this is particularly true in larger organisations, which have struggled to shift from the old to the new).

If you take a peek into organisation structures today, they often still resemble those of the late 90s, where infrastructure capabilities were organised by specialism: data centre, storage, service management, application support and so on. There have been changes, more recently with the shift to DevOps and continuous integration and delivery, but there is still a long way to go.

Our recent Technology Futures Survey provided a great insight into how our clients (290) are responding to the shifting technology services landscape.

“What will your IT department look like in 5-7 years’ time?”

There were no surprises in the large majority of respondents agreeing that the organisation will look different in the near future. The big shift is to a more service-focused, vendor-led technology model, with between 53% and 65% believing that this is the direction of travel.

One surprise was a relatively low consensus on the impact that Artificial Intelligence (“AI”) would have on management of live services, with only 10% saying it would be very likely. However, the providers of technology and services formed a smaller proportion of our respondents (28%) and naturally were more positive about the impact of AI.

The Broadgate view is that the changing shape of digital service delivery is challenging previous models and applying tension to organisations and providers alike. There are two main areas where we see this:

  1. With the shift to cloud based and on-demand services, the need for any provider, whether internal or external, has diminished
  2. Automation, AI and machine learning are developing new capabilities in self-managing technology services

We expect that the technology organisation will shift to focus more on business products and procuring the best fit service providers. Central to this is AI and ML which, where truly intelligent (and not just marketing), can create a self-healing and dynamic compute capability with limited human intervention.

Cloud, machine learning and RPA will remove much of the need to manage and develop code

To really understand how the organisation model is shifting, we have to look at the impact that technology is having on the whole supply chain. We've long outsourced the delivery of services. However, if we look at the traditional service providers (IBM, DXC, TCS, Cognizant etc.) that in the first instance acted as brokers for these new digital technology innovations, we see that they are increasingly being disintermediated, with provisioning and management now directly in the hands of the consumer.

Companies like Microsoft, Google and Amazon have superior technical expertise and are continuing to expose it directly to the end consumer. Thus, the IT department needs to think less about how to build or procure from a third party, and more about how to build a framework of services which "knits together" a service model that best meets their business needs with a layered, end-to-end approach. This fits perfectly with a more business-product-centric approach.

We don’t see an increase in in-house technology footprints, with the possible exception of truly data-driven organisations or tech companies themselves.

In our results, the removal of cyber security issues was endorsed by 28% with a further 41% believing that this was a possible outcome. This represents a leap of faith given the current battle that organisations are undertaking to combat data breaches! Broadgate expect that organisations will increasingly shift the management of these security risks to third party providers, with telecommunication carriers also taking more responsibilities over time.

As the results suggest, the commercial and vendor management aspects of the IT department will become more important. This is often a skill which is absent in current companies, so a conscious strategy to develop capability is needed.

Organisations should update their operating model to reflect the changing shape of technology services; the close alignment of products and services to technology provision has never been as important as it is today.

Indeed, our view is that even if your model serves you well today, by 2022 it is likely to look fairly stale. This is because what your company currently offers to your customers is almost certain to change, which will require fundamental re-engineering across, and around, the entire IT stack.

Selecting a new “digitally focused” sourcing partner

Posted on : 18-07-2018 | By : john.vincent | In : Cloud, FinTech, Innovation, Uncategorized

It was interesting to see the recent figures this month from the ISG Index, showing that the traditional outsourcing market in EMEA has rebounded. Figures for the second quarter for commercial outsourcing contracts show a combined annual contract value (ACV) of €3.7Bn. This is up a significant 23% on 2017 and, for the traditional sourcing market, reverses a downward trend which had persisted for the previous four quarters.

This is an interesting change of direction, particularly against a backdrop of economic uncertainty around Brexit and much "over-indulged" GDPR preparation. It seems that despite this, rather than hunkering down with a tin hat and stockpiling rations, companies in EMEA have invested in their technology service provision to support agile digital growth for the future. The global number also accelerated, up 31% to a record ACV of €9.9Bn.

Underpinning some of these figures has been a huge acceleration in the As-a-Service market. In the last 2 years the ACV attributed to SaaS and IaaS has almost doubled. This has been fairly consistent across all sectors.

So when selecting a sourcing partner, what should companies consider outside of the usual criteria including size, capability, cultural fit, industry experience, flexibility, cost and so on?

One aspect that is interesting from these figures is the influence that technologies such as cloud based services, automation (including AI) and robotic process automation (RPA) are having both now and in the years to come. Many organisations have used sourcing models to fix costs and benefit from labour arbitrage as a pass-through from suppliers. Indeed, this shift of labour ownership has fuelled incredible growth within some of the service providers. For example, Tata Consultancy Services (TCS) has grown from 45.7k employees in 2005 to 394k in March 2018.

However, having reached this heady number of staff, the technologies mentioned previously are threatening the model of some of these companies. As-a-Service providers such as Microsoft Azure and Amazon AWS now have platforms which are carving their way through technology service provision that previously would have been managed by human beings.

In the infrastructure space, commoditisation is well under way. Indeed, we predict that within three years the build, configure and manage skills in areas such as Windows and Linux platforms will rarely be in demand. DevOps models, and variants thereof, are moving at a rapid pace, with tools to support spinning up platforms on demand for application services now mainstream. Service providers often focus on their technology overlay "value add" in this space, with portals or orchestration products which can manage cloud services. However, the value of these is often questionable compared with direct access or commercial third-party products.

Secondly, as we’ve discussed here before, technology advances in RPA, machine learning and AI are transforming service provision. This is not just in terms of business applications but also the underpinning services. It is translating into areas such as self-service bots which end users can query for solutions and guidance, or self-learning AI processes which can predict potential system failures before they occur and take preventative action.

These advances present a challenge to the workforce focused outsource providers.

Given the factors above, and the market shift, it is important that companies take these into account when selecting a technology service provider. Questions to consider are:

  • What are their strategic relationships with cloud providers, and not just at the “corporate” level – do they have in-depth knowledge of the whole technology ecosystem at a low level?
  • Can they demonstrate skills in the orchestration and automation of platforms at an “infrastructure as code” level?
  • Do they have the capability to deliver process automation through techniques such as bots, can they scale to the enterprise, and where are their RPA alliances?
  • Does the potential partner have domain expertise, and are they open to partnership around new products and shared reward/JV models?

The traditional sourcing engagement models are evolving, which has created new opportunities on both sides. Expect new entrants, without the technical debt and organisational overheads and with a more technology-solution focus, to disrupt the market.

The Opportunity for Intelligent Process Automation in KYC / AML

Posted on : 28-06-2018 | By : richard.gale | In : compliance, Data, Finance, FinTech, Innovation

Financial services firms have had a preoccupation with meeting the rules and regulations for fighting Financial Crime for the best part of the past decade. Ever since HSBC received sanction from both UK and US regulators in 2010, many other firms have also been caught short in failing to meet society’s expectations in this space. There have been huge programmes of change and remediation, amounting to tens of billions of any currency you choose, to try to get Anti-Financial Crime (AFC) or Know Your Customer (KYC) / Anti-Money Laundering (AML) policies, risk methodologies, data sources, processes, organisation structures, systems and client populations into shape, at least to be able to meet the expectations of regulators, if not exactly stop financial crime.

The challenge for the industry is that Financial Crime is a massive and complex problem to solve. It is not just about the detection and prevention of money laundering; it also needs to cover terrorist financing, bribery & corruption and tax evasion. Therefore, as the banks, asset managers and insurers have been doing, there is a need to focus upon all elements of the AFC regime, from education to process, and all the other activities in between. Estimates as to the scale of the problem vary, but the consensus is that somewhere between $3 trillion and $5 trillion is introduced into the financial system each year.

However, progress is being made. Harmonisation, clarity of industry standards and more consistency have come from the regulators with initiatives such as the 4th EU AML Directive. The importance of the controls is certainly better appreciated and understood within Financial Services firms and by their shareholders. Perhaps what has not yet progressed significantly are the processes of performing client due diligence and monitoring clients’ subsequent activity. Most would argue that this is down to a number of factors, possibly the greatest challenge being the disparate and inconsistent nature of the data required to support these processes. Data needs to be sourced in many formats from country registries, stock exchanges, documents of incorporation, multiple media sources and so on. Still today, many firms have a predominantly manual process to achieve this, even when much of the data is available in digital form. Many still do not automatically ingest data into their workflows and have poorly defined processes for progressing onboarding or monitoring activities. And this is for the regulations as they stand today; in the future the burden will increase further, as firms will be expected to take all possible steps to determine the integrity of their clients, e.g. by establishing linkages to bad actors through data sources such as social media and the dark web that are not evident in traditional sources such as company registries.

There have been several advances in recent years in technologies that have enormous potential for supporting the AFC cause. Data vendors have made big improvements in providing broader and higher-quality data. Aggregation solutions such as Encompass offer services where the constituents of a corporate ownership structure can be assembled, and sanctions & PEP checks undertaken, in seconds rather than the current norm of multiple hours. This works well where the data is available from a reliable electronic source. However, it does not work where there are no, or only unreliable, sources of digital data, as is the case for Trusts or in many jurisdictions around the world. Here we quickly get back to the world of paper and PDFs, which still require human horsepower to review and decide upon.

Getting the information in the first instance can be very time consuming, with complex interactions between multiple parties (relationship managers, clients, lawyers, data vendors, compliance teams etc.) and multiple communications channels, i.e. voice, email and chat in their various forms. We also have the challenge of adverse media, where thousands of news stories are generated every day on the corporates and individuals that are the clients of financial firms. The news items can be positive or negative, but reviewing, eliminating or investigating this mountain of data consumes tens of thousands of people each day. The same challenges come with transaction monitoring, where individual firms can have thousands of ‘hits’ every day on ‘unusual’ payment patterns or ‘questionable’ beneficiaries. These also require review, repair, discounting or further investigation, and the clear majority are false positives that can be readily discarded.

What is probably the most interesting opportunity for allowing the industry to see the wood for the trees in this data-heavy world is the maturing of Artificial Intelligence (AI) based, or ‘Intelligent’, solutions. The combination of Natural Language Processing with Machine Learning can help the human find the needles in the haystack, or make sense of unstructured data that would ordinarily require much time to read and record. AI on its own is not a solution, but combined with process management (workflow), digitised multi-channel communications and even robotics it can achieve significant advances. In summary, ‘Intelligent’ processing can address three of the main data challenges within financial institutions’ AFC regimes:

  1. Sourcing the right data – Where data is structured and digitally obtainable it can be readily harvested, but it needs to be integrated into the process flows to be compared, analysed, accepted or rejected as part of a review process. Here AI can be used to perform these comparisons, support analysis and look for patterns of common or disparate data. Where the data is unstructured, i.e. embedded in a paper document (email / PDF / doc etc.), NLP and machine learning can be used to extract the relevant data and turn the unstructured into structured form for onward processing.
  2. Filtering – With both transaction monitoring and adverse media reviews there is a tsunami of data and events presented to Compliance and Operations teams for sifting, reviewing, rejecting or further investigation. The use of AI can be extremely effective at performing this sifting and presenting back only relevant results to users (a minimal filtering sketch follows this list). Done correctly, this can reduce the burden by 90+% but, perhaps more importantly, never miss or overlook a case, providing reassurance that relevant data is being captured.
  3. Automating the workflow – By using intelligent workflows, processes can be fully automated where simple decision making is supported by AI, removing the need for manual intervention in many tasks and leaving the human to provide value at the complex end of problem solving.
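To illustrate the filtering idea in point 2, the sketch below is a minimal, hypothetical example (assuming scikit-learn, a handful of invented news headlines and an arbitrary threshold; it is not any vendor's product): a text classifier trained on past analyst decisions scores each incoming story so that only likely-relevant items reach a human reviewer.

```python
# Minimal sketch of adverse-media filtering: a text classifier trained on
# previously reviewed stories, used to triage the daily news feed.
# All company names, headlines and thresholds are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Past analyst decisions: 1 = genuinely adverse / relevant, 0 = noise.
train_stories = [
    "Company X fined for sanctions breaches by US regulator",
    "Director of Company Y charged with bribery and corruption",
    "Company X announces quarterly results ahead of forecast",
    "Company Y sponsors local charity football tournament",
]
labels = [1, 1, 0, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_stories, labels)

# Score today's feed; only higher-probability hits go to a human reviewer.
todays_feed = ["Company Z under investigation for money laundering",
               "Company Z opens new office in Leeds"]
for story, p in zip(todays_feed, model.predict_proba(todays_feed)[:, 1]):
    if p > 0.5:                      # illustrative threshold
        print(f"REVIEW  ({p:.2f}): {story}")
    else:
        print(f"discard ({p:.2f}): {story}")
```

In practice the training set would be the firm's own history of reviewed cases, and the threshold would be tuned so that genuine hits are never filtered out.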

Solutions are now emerging in the industry, such as OPSMATiX, one of the first Intelligent Process Automation (IPA) solutions. Devised by a group of industry business experts, it is a set of technologies that combine to make sense of data across different communication channels, use AI to turn the unstructured data into structured form, and apply robust workflows to optimally manage the resolution of cases, exceptions and issues. The data vendors, and solution vendors such as Encompass, are also embracing AI techniques and technologies to effectively create ‘smart filters’ that can be used to scour through thousands, if not millions, of pieces of news and other media to discover, or discount, information of interest. This can be achieved in a tiny fraction of the time, and therefore cost, and more importantly with far better accuracy than a human can achieve. The outcome will be to liberate the human from the process; firms can either choose to reduce the costs of their operations or use people more effectively to investigate and analyse those events, information and clients that may be of genuine cause for concern, rather than deal with the noise.

Only once the process has been made significantly more efficient, and the data brought under control, can financial firms really start to address the insidious business of financial crime. Currently all the effort is still going into meeting the regulations, not society’s actual demand, which is to combat this global menace. Intelligent process automation should unlock this capability.

 

Guest Author : David Deane, Managing Partner of FIMATIX and CEO of OPSMATiX. David has had a long and illustrious career in global Operations and Technology leadership with wholesale banks and wealth managers. Before creating FIMATIX and OPSMATiX, he was most recently the Global Head of KYC / AML Operations for a Tier 1 wholesale bank.

david.deane@fimatix.com

AI Evolution: Survival of the Smartest

Posted on : 21-05-2018 | By : richard.gale | In : Innovation, Predictions

Artificial intelligence is getting very good at identifying things: let it analyse a million pictures, and it can tell with amazing accuracy which show a child crossing the road. But AI is hopeless at generating such images by itself. If it could do that, it would be able to create realistic but synthetic pictures depicting people in various settings, which a self-driving car could use to train itself without ever going out on the road.

The problem is that creating something entirely new requires imagination, and until now that has been a step too far for machine learning.

There is an emerging solution, first conceived by Ian Goodfellow during an academic argument in a bar in 2014. The approach, known as a generative adversarial network, or “GAN”, takes two neural networks (the simplified mathematical models of the human brain that underpin most modern machine learning) and pits them against each other to identify flaws and gaps in the other’s model.

Both networks are trained on the same data set. One, known as the generator, is tasked with creating variations on images it has already seen: perhaps a picture of a pedestrian with an extra arm. The second, known as the discriminator, is asked to identify whether the example it sees is like the images it has been trained on or a fake produced by the generator: basically, is that three-armed person likely to be real?

Over time, the generator can become so good at producing images that the discriminator can’t spot fakes. Essentially, the generator has been taught to recognize, and then create, realistic-looking images of pedestrians.
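The adversarial training loop itself is compact enough to sketch in code. The toy example below (assuming PyTorch, with a one-dimensional number distribution standing in for images) is an illustration of the dynamic described above rather than a production image GAN: the generator learns to produce convincing “fakes” while the discriminator learns to tell them apart from the real thing.

```python
# Toy GAN: the generator learns to mimic samples from a simple 1-D "real" distribution.
# Illustrative sketch only (assumes PyTorch); real image GANs use convolutional
# networks, far more data and many more training tricks.
import torch
import torch.nn as nn

def real_samples(n):            # the "real data": normal with mean 4.0, std 1.25
    return torch.randn(n, 1) * 1.25 + 4.0

def noise(n):                   # random input for the generator
    return torch.randn(n, 8)

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))               # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())  # discriminator

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # 1) Train the discriminator: real samples should score 1, generated fakes 0.
    real, fake = real_samples(64), G(noise(64)).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Train the generator: produce fakes that the discriminator scores as real.
    loss_g = bce(D(G(noise(64))), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# If training worked, generated samples cluster around the real mean of 4.0.
print("mean of generated samples:", G(noise(1000)).mean().item())
```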

The technology has become one of the most promising advances in AI in the past decade, able to help machines produce results that fool even humans.

GANs have been put to use creating realistic-sounding speech and photorealistic fake imagery. In one compelling example, researchers from chipmaker Nvidia primed a GAN with celebrity photographs to create hundreds of credible faces of people who don’t exist. Another research group made not-unconvincing fake paintings that look like the works of van Gogh. Pushed further, GANs can reimagine images in different ways: making a sunny road appear snowy, or turning horses into zebras.

The results aren’t always perfect: GANs can conjure up bicycles with two sets of handlebars, say, or faces with eyebrows in the wrong place. But because the images and sounds are often startlingly realistic, some experts believe there’s a sense in which GANs are beginning to understand the underlying structure of the world they see and hear. And that means AI may gain, along with a sense of imagination, a more independent ability to make sense of what it sees in the world. 

This approach is starting to provide programmed machines with something along the lines of imagination. This, in turn, will make them less reliant on human help to differentiate. It will also help blur the lines between what is real and what is fake. In an age where we are already plagued with ‘fake news’ and doctored pictures, are we ready for seemingly real but constructed images and voices?

Let’s think Intelligently about AI.

Posted on : 17-01-2017 | By : richard.gale | In : Uncategorized

Currently there is a daily avalanche of artificial intelligence (AI) related news clogging the internet. Almost every new product, service or feature has an AI, ‘machine learning’ or ‘robo-something’ angle to it. So what is so great about AI? What is different about it, and how can it improve the way we live and work? We think there has been an over-emphasis on ‘machine learning’ that relies on crunching huge amounts of information via a set of algorithms. The actual ‘intelligence’ part has been overlooked; the unsupervised way humans learn, through observation and by modifying our behaviour based on the results of our actions, is missing. Most ‘AI’ tools today work well but have a very narrow range of abilities, with no ability to think as creatively or as widely as a human (or animal) brain.

Origins

Artificial Intelligence as a concept has been around for hundreds of years: the idea that human thought, learning, reasoning and creativity could be replicated in some form of machine. AI as an academic practice really grew out of the early computing concepts of Alan Turing, and the first AI research lab was created at Dartmouth College in 1956. The objective seemed simple: create a machine as intelligent as a human being. The original team quickly found they had grossly underestimated the complexity of the task, and progress in AI moved gradually forward over the next 50 years.

Although there are a number of approaches to AI, all generally rely on learning: processing information about the environment, how it changes, and the frequency and type of inputs experienced. This can result in a huge amount of data to be absorbed. The combination of vast amounts of computing power and storage with massive amounts of information (from computer searches and interaction) has enabled AI, sometimes known as machine learning, to gather pace. There are three main types of learning in AI:

  • Reinforcement learning – This is focused on the problem of how an AI tool ought to act in order to maximise the chance of solving a problem. In a particular situation, the machine picks an action or a sequence of actions, and progresses. This is frequently used when teaching machines to play and win chess games. One issue is that, in its purest form, reinforcement learning requires an extremely large number of repetitions to achieve a level of success.
  • Supervised learning – The programme is told what the correct answer is for a particular input: here is the image of a kettle; the correct answer is “kettle”. It is called supervised learning because the process of an algorithm learning from the labelled training data set is similar to showing a picture book to a young child. The adult knows the correct answer and the child makes predictions based on previous examples. This is the most common technique for training neural networks and other machine learning architectures. An example might be: given the descriptions of a large number of houses in your town together with their prices, try to predict the selling price of your own home (a minimal sketch of this example follows this list).
  • Unsupervised learning / predictive learning – Much of what humans and animals learn, they learn in the first hours, days, months and years of their lives in an unsupervised manner: we learn how the world works by observing it and seeing the results of our actions. No one is there to tell us the name and function of every object we perceive. We learn very basic concepts, like the fact that the world is three-dimensional, that objects don’t disappear spontaneously, and that objects that are not supported fall. We do not know how to do this with machines at the moment, at least not at the level that humans and animals can. Our lack of techniques for unsupervised or predictive learning is one of the factors limiting the progress of AI.
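To make the supervised house-price example above concrete, here is a minimal sketch using scikit-learn; the houses, features and prices are entirely invented for illustration.

```python
# Minimal supervised-learning sketch: predict a house price from labelled examples.
# All figures are invented for illustration.
from sklearn.linear_model import LinearRegression

# Training data: [floor area in sq m, number of bedrooms] -> sale price in pounds
X_train = [[70, 2], [85, 3], [100, 3], [120, 4], [150, 5]]
y_train = [250_000, 295_000, 330_000, 400_000, 480_000]

model = LinearRegression().fit(X_train, y_train)

# Predict the selling price of "your own home": 95 sq m, 3 bedrooms.
predicted = model.predict([[95, 3]])[0]
print(f"Estimated price: {predicted:,.0f}")
```

The labelled prices play the role of the adult with the picture book: the model only learns because every training example already carries the correct answer.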

How useful is AI?

We are constantly interacting with AI. There is a multitude of programmes working, helping and predicting your next move (or at least trying to). Working out the best route is an obvious one, where Google uses feedback from thousands of other live and historic journeys to route you the most efficient way to work, then updates its algorithms based on the results from yours. Ad choices and ‘people also liked / went on to buy’ suggestions all assist in some way to make our lives ‘easier’. The way you spend money is predictable, so any unusual behaviour can result in a call from your bank to check a transaction. Weather forecasting uses machine learning (and an enormous amount of processing power combined with historic data) to provide improving short- and medium-term forecasts.

One of the limitations of current reinforcement and supervised models of learning is that, although we can build a highly intelligent device, it has a very limited focus. The chess computer ‘Deep Blue’ could beat grandmaster human chess players but, unlike them, it cannot drive a car, open a window or describe the beauty of a painting.

What’s next?

So could a machine ever duplicate or move beyond the capabilities of a human brain? The short answer is ‘of course’. Another short answer is ‘never’. Computers and programmes are getting more powerful, sophisticated and consistent each year. The amount of digital data is doubling on a yearly basis and the reach of devices is expanding at extreme pace. What does that mean for us? Who knows is the honest answer. AI and intelligent machines will become a part of all our daily lives, but the creativity of humans should ensure we partner with them and use them to enrich and improve our lives and environment.

‘Deep Learning’ is the latest buzz term in AI. Some researchers explain this as ‘working just like the brain’; a better explanation, from Yann LeCun (Head of AI at Facebook), is ‘machines that learn to represent the world’. So: more general-purpose machine learning tools rather than highly specialised single-purpose ones. We see this as the next likely direction for AI, in the same way, perhaps, that the general-purpose Personal Computer (PC) transformed computing from dedicated single-purpose to multi-purpose business tools.

 

Feedback Loops: How to maximise value from your Big Data

Posted on : 27-06-2012 | By : richard.gale | In : General News

Closing the feedback loop

With businesses becoming increasingly sensitive to customer opinion of their brand, monitoring consumer feedback is becoming ever more important.  Additionally, the recognition of social media as an important and valid source of customer opinion has brought about a need for new systems and a new approach.

Traditional approaches, such as a PR department reacting to press coverage or infrequent customer surveys conducted online or by phone, are all part of extremely slow-cycle feedback loops that are no longer adequate to capture the ever-changing shifts in public sentiment.

They represent a huge hindrance to any business looking to improve brand relations; delay in feedback can cost real money.  Inevitably, the manual sections of traditional approaches create huge delays in information reaching its users.  These days, we need constant feedback and we need low-latency – the information needs to be almost real-time.  Wait a few moments too long, and suddenly the intelligence you captured could be stale and useless.

 

A social media listening post

Witness the rise of the “social media listening post”: a new breed of system designed to plug directly in to social networks, constantly watching for brand feedback automatically around the clock.  Some forward-thinking companies have already built such systems.  How does yours keep track right now?  If your competitors have it and you don’t, does that give them a competitive advantage over you?

I’d argue for the need for most big brands to have such a system these days.  Gone are the days when businesses could wait months for surveys or focus groups to trickle back with a sampled response from a small select group.  In that time, your brand could have been suffering ongoing damage, and by the time you find out, valuable customers have been lost.  Intelligence is readily available these days on a near-instantaneous basis; can you afford not to use it?

Some emerging “Big Data” platforms offer the perfect tool for monitoring public sentiment toward a company or brand, even in the face of the rapid explosion in data volumes from social media, which could easily overwhelm traditional BI analytics tools.  By implementing a social media “listening post” on cutting-edge Big Data technology, organisations now have the opportunity to unlock a new dimension in customer feedback and insight into public sentiment toward their brands.

Primarily, we must design the platform for low-latency continuous operation to allow complete closure of the feedback loop – that is to say, events (news, ad campaigns etc.) can be monitored for near-real-time positive/negative/neutral response by the public – thus bringing rapid response, and corrections in strategy, into the realm of possibility.  Maybe you could just pull that new ad campaign early if public reaction to the material turned out to be disastrous and unexpected.  It’s also about understanding trends and topics of interest to a brand audience, and who the influencers are.  Social media platforms like Twitter offer a rich granular model for exploring this complex web of social influence.

The three main challenges inherent in implementing a social media listening post are:

  • Data volume
  • Complexity of data integration – e.g. unstructured, semi-structured, evolving schema etc
  • Complexity of analysis – e.g. determining sentiment: is it really a positive or negative statement with respect to the brand?

To gain a complete picture of public opinion towards your brand or organisation through social media, many millions of web sites and data services must be consumed, continuously, around the clock.  They need to be analysed in complex ways, far beyond traditional data warehouse query functionality.  Even a sentiment analysis capability on its own poses a considerable challenge, and as a science it is still an emerging discipline; even more advanced techniques in Machine Learning may prove necessary to correctly interpret all signals from the data.  Data format will vary greatly among social media sources, ranging from regular ‘structured’ data through semi- and unstructured forms, to complex poly-structured data with many dimensions.  This structural complexity poses extreme difficulty for traditional data warehouses and up-front ETL (Extract-Transform-Load) approaches, and demands a far more flexible data consumption platform.
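As a taste of what “determining sentiment” involves at its very simplest, here is a toy, lexicon-based scorer in Python; the brand name, word lists and posts are invented, and a real listening post would use trained machine-learning models rather than a fixed word list.

```python
# Toy sentiment scoring over a stream of social posts mentioning a brand.
# Lexicon-based for illustration only; production systems use trained ML models
# and handle negation, sarcasm, slang and context far more carefully.
POSITIVE = {"love", "great", "excellent", "recommend", "fast"}
NEGATIVE = {"hate", "awful", "broken", "slow", "refund"}

def score(post: str) -> int:
    """Return +1 (positive), -1 (negative) or 0 (neutral) for a single post."""
    words = {w.strip(".,!?").lower() for w in post.split()}
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    return (pos > neg) - (neg > pos)

stream = [
    "Love the new AcmeCo app, so fast!",
    "AcmeCo delivery was awful, still waiting for a refund",
    "Visited the AcmeCo store today",
]
for post in stream:
    print(score(post), post)
```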

So how do we architect a system like this?  Generally speaking, at its core you will need some kind of distributed data capture and analysis platform.  Big Data platforms were designed to address problems where you have Volume, Variety, or Velocity of data – and most often, all three.  In this particular use-case, we need to look towards the cutting-edge of the technology, and look for a platform which supports near-real time, streaming data capture and analysis, with the capability to implement Machine Learning algorithms for the analytics/sentiment analysis component.

For the back-end, a high-throughput data capture/store/query capability is required, suitable for continuous streaming operation, probably with redundancy/high-availability, and a non-rigid schema layer capable of evolving over time as the data sources evolve.  So-called “NoSQL” database systems (which in fact stands for “Not Only SQL” rather than no SQL) such as Cassandra, HBase or MongoDB offer excellent properties for high-volume streaming operation and would be well suited to the challenge, or there are also commercial derivatives of some of these platforms on the market, such as the excellent Acunu Data Platform, which commercialises Cassandra.

Additionally, a facility for complex analytics, most likely via parallel, shared-nothing computation (due to the extreme data volumes), will be required to derive any useful insight from the data you capture.  For this component, paradigms like MapReduce are a natural choice, offering the benefits of linear scalability and unlimited flexibility in implementing custom algorithms, and Machine Learning libraries such as the great Apache Mahout project have grown up to provide a toolbox of analytics on top of the MapReduce programming model.  Hadoop is an obvious choice when it comes to exploiting the MapReduce model, but since the objective here is to achieve near-real-time streaming capability, it may not always be the best choice.  Cassandra and HBase (which in fact runs on Hadoop) can be a good choice since they offer the low-latency characteristics, coupled with MapReduce analytic capabilities.
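To show why MapReduce suits this analytics component, the sketch below simulates the paradigm locally in plain Python: a map step emits a (date, sentiment) key with a count of one for each scored post, and a reduce step sums the counts per key. The data is invented; on Hadoop the same map and reduce logic would run distributed across the cluster.

```python
# Local illustration of the MapReduce programming model for sentiment counting.
# On a real cluster the map and reduce phases run in parallel across many nodes.
from collections import defaultdict

# Pre-scored posts: (date, sentiment label); invented data for illustration.
posts = [("2012-06-26", "positive"), ("2012-06-26", "negative"),
         ("2012-06-26", "positive"), ("2012-06-27", "neutral")]

def map_phase(records):
    """Map: emit ((date, sentiment), 1) for every scored post."""
    for date, sentiment in records:
        yield (date, sentiment), 1

def reduce_phase(pairs):
    """Shuffle + Reduce: sum the counts for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

daily_sentiment = reduce_phase(map_phase(posts))
print(daily_sentiment)
# e.g. {('2012-06-26', 'positive'): 2, ('2012-06-26', 'negative'): 1, ...}
```

Because the map and reduce functions are independent per record and per key, the same logic scales linearly as data volumes grow, which is exactly the property this workload needs.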

Finally, some form of front-end visualization/analysis layer will be necessary to graph and present results in a usable visual form.  There are some new open-source BI analytics tools around which might do the job, as well as a variety of commercial offerings in this area.  The exact package to be selected for this component is strongly dependent on the desired insight and form of visualization, and so is probably beyond the scope of this article, but one clear requirement is that it needs to interface with whatever back-end storage layer you choose.

Given the cutting-edge nature of many of the systems required, a solid operational team is really essential to maintain and tune the system for continuous operation.  Many of these products have complex tuning requirements demanding specialist skill with dedicated headcount.  Some of the commercial open-source offerings have support packages that can help mitigate this requirement, but either way, the need for operational resource must never be ignored if the project is to be a success.

The technologies highlighted here are evolving rapidly, with variants or entirely new products appearing frequently; as such, it would not be unreasonable to expect significant advancement in this field within a 6-12 month timeframe.  This will likely translate into advancement on two fronts: increased functional capability of the leading distributed data platforms, in areas such as query interfaces and indexing, and reduced operational complexity and maintenance requirements.

Tim Seears – CTO Big Data Partnership.

www.bigdatapartnership.com

Big Data Partnership and Broadgate Consultants are working together to help organisations unlock the value in their big data.  This partnership allows us to provide a full suite of services from thought-leadership, strategic advice and consultancy through delivery and implementation of robust, supportable big data solutions.  We would be delighted to discuss further the use case outlined above should you wish to explore adapting these concepts to your own business.