Insurance and big data: what’s the big deal?

On 24 November 2015, the Financial Conduct Authority (FCA) published a Call for Inputs on the use of ‘Big Data’ in the retail general insurance sector. Big Data refers to the use of new datasets covering, for example, social media activity, purchasing behaviour, or driving data collected by a ‘black box’ telematics device, as well as advanced technologies to generate, store, process and analyse such data. The FCA seeks to understand how Big Data is likely to affect consumer outcomes and competition in the sector, both now and in the foreseeable future.

The Call for Inputs follows previous work related to this topic, including our research on the collection and use of consumer data for the Competition and Markets Authority (CMA), which focused on the motor insurance sector amongst others.

The Big Data revolution?

Our research found that, though insurers have always relied on data as an input, innovative uses of data could further enhance insurers’ efficiency. Big Data might provide a fuller picture of individuals to enable precise risk assessments, or it might help insurers detect behaviour that may be fraudulent.

However, whilst insurers and brokers do generally seem to be exploring ways to evaluate risks more accurately by exploiting new data sources and techniques, industry views on the scale of the Big Data ‘revolution’ differ. Claims of a data-gathering ‘arms race’ by some stakeholders meet with marked scepticism from others.

In any case, Big Data does not appear to have transformed the sector thus far. It is therefore important to acknowledge that the benefits and concerns related to Big Data may be largely hypothetical at this point.

Whilst many applications of Big Data are still in their infancy and there is scepticism among some stakeholders about their true potential, there is nevertheless an expectation that firms and consumers both stand to benefit from such changes.

Potential areas of concern

However, there are also some concerns around expanding data collection and usage. Focusing, as the FCA’s Call for Inputs does, on the use of data for the purposes of risk evaluation,1 our research found that Big Data could make insurers’ approach to risk assessment less transparent. Insurers may take into account more variables and data points (in addition to ‘traditional’ variables, such as vehicle value and driving history) and apply more complex analytical techniques to them. Such complexity makes it more difficult to understand how an insurer assesses risks and determines the corresponding premiums. Moreover, insurers have legitimate incentives to restrict transparency over this process, as it can be a key source of competitive advantage.

With reduced transparency there is a risk that Big Data could harm competition and consumers, as well as making it more challenging for regulators to monitor firms. Possible concerns may include weak consumer engagement, barriers to switching, invasions of privacy, unfair or discriminatory pricing, and the exclusion of particular consumer groups. Any such concerns, though, should be considered together with any potential efficiency gains from Big Data.

Consumer engagement

A lack of transparency can be a barrier to consumer engagement in a market. For example, the CMA’s energy market investigation found that informational problems contributed to low engagement and weak consumer response: customers lack information on how their behaviour can affect consumption, and bills can be complex and opaque.2 In the same way, insurance customers might know less and less about how premiums are determined and how their behaviour can affect their premiums (e.g. where risk assessments are based on a vast range of data about shopping patterns, social media activity, etc.). This situation could compound existing barriers to accessing and assessing information in a sector where some insurance policies are longer than George Orwell’s Animal Farm.3 Weak consumer engagement might result in less effective competition and firms holding market power over unengaged customers.


A further informational issue might arise in relation to the portability of information. As more and more datasets become relevant for risk assessment, it might be that not all insurance providers have access to the same datasets. In this case, a policyholder might be restricted when shopping around if not all relevant information can be transferred to alternative providers. This may apply in particular to driving data collected by telematics devices or apps – such data is currently not standardised or transferable between providers. There is therefore the potential for customers to become ‘locked in’ to their current provider, weakening competition. This scenario might only be of concern if telematics policies become a mainstream product, which remains uncertain.4


Limited transparency could also raise privacy concerns. There is evidence that consumers currently are unaware of the range of data used by insurance providers (and would object to such use if they knew).5 The situation could be made worse if insurers were to draw upon new sources of personal data. This issue may not be specific to the insurance sector; it arguably reflects broader concerns about consumer awareness and ability to control the data that companies collect and use. The fiction of presumed consent becomes ever harder to sustain where approving privacy policies without reading them is almost the standard behaviour.6 Even if consumers attempt to read them, privacy policies are often long, legalistic, vague and not easily comparable between companies. Insurers are mindful of the importance of consumer trust and are likely to take any potential ramifications into account when considering Big Data projects.

Fairness and discrimination

Economic theory indicates that enhanced risk classification may have different consequences for efficiency and equity.7 Various economic models test the effect of regulatory restrictions on risk classification, often finding that such restrictions limit overall efficiency but may have desirable distributional consequences. Conversely, Big Data-enhanced risk classification might in theory boost efficiency but have a negative impact on distribution.

The FCA identifies micro-segmentation as an area of interest. The idea is that Big Data might allow much more accurate risk assessments, such that customers are grouped into smaller, more homogeneous risk pools. The net effect on consumers is ambiguous, but there is a possibility that high-risk consumers are ‘left out’, where previously they would have been given cover as part of a larger risk pool in which lower-risk individuals subsidised the higher-risk ones. Whether or not this is a concern may depend on whether one views insurance products as a necessity that should be available to all, through the spreading of risk and subsidisation of higher risks where necessary. In this view, inclusivity would clearly be a priority and the exclusion of certain groups would be particularly concerning if it affected vulnerable consumers. However, it may be too soon to judge whether this is likely to be the case.
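The cross-subsidy at stake can be illustrated with a small worked example. The sketch below is purely illustrative: the group sizes, claim probabilities and claim cost are hypothetical figures chosen for round numbers, the function names are our own, and premiums are assumed to equal expected claims costs with no expenses or profit loading.

```python
# Hypothetical figures, chosen only for illustration.

def fair_premium(claim_probability, claim_cost):
    """Expected claims cost per policyholder (no loadings or expenses)."""
    return claim_probability * claim_cost


def pooled_premium(groups, claim_cost):
    """Single break-even premium charged to everyone in one pool.

    groups: list of (number_of_policyholders, claim_probability) pairs.
    """
    total_people = sum(n for n, _ in groups)
    expected_claims = sum(n * p * claim_cost for n, p in groups)
    return expected_claims / total_people


# Assumed pool: 900 low-risk and 100 high-risk policyholders.
groups = [(900, 0.05), (100, 0.30)]
claim_cost = 2000

pooled = pooled_premium(groups, claim_cost)
segmented = [fair_premium(p, claim_cost) for _, p in groups]

print(f"Pooled premium for everyone: {pooled:.0f}")
print(f"Segmented premium, low risk: {segmented[0]:.0f}")
print(f"Segmented premium, high risk: {segmented[1]:.0f}")
```

With these made-up numbers, everyone pays 150 in the single pool, whereas segmentation gives premiums of 100 for the low-risk group and 600 for the high-risk group. Low-risk customers gain, but the high-risk group faces a fourfold increase, which is the sense in which some consumers might be priced out.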

Standing out

The use of complex models using a broad range of variables can also lead to outcomes that could be deemed unfair. For example, motor insurance premiums might be strongly affected by characteristics that have no obvious relation to driving risk. While insurers may well have incentives to identify and disregard spurious correlations,8 it should not be assumed that they would do so; indeed, doing so may be more difficult as the volumes of data and complexity of algorithms increase. Theoretically there might even be a question of possible discrimination against certain groups and doubts over the legality of certain practices (e.g. where certain ethnic or religious groups are ultimately priced differently by Big Data models, though with no deliberate intent to discriminate against them).9 With limited transparency, such outcomes may be inherently difficult to observe.
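To see why spurious correlations become harder to avoid as the number of variables grows, consider a minimal simulation. It is a sketch under stated assumptions rather than a description of any insurer's model: the ‘claims’ series and all candidate variables are random noise generated independently of each other, and the sample size and number of variables are arbitrary choices.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)   # fixed seed so the run is repeatable
n_policyholders = 20     # assumed (small) sample size
n_variables = 1000       # assumed number of candidate variables

# Hypothetical 'claims' outcome: pure noise.
claims = [rng.gauss(0, 1) for _ in range(n_policyholders)]

# Each candidate variable is also pure noise, unrelated to claims.
correlations = []
for _ in range(n_variables):
    noise = [rng.gauss(0, 1) for _ in range(n_policyholders)]
    correlations.append(pearson(noise, claims))

strongest = max(abs(r) for r in correlations)
print(f"Strongest correlation among {n_variables} unrelated variables: {strongest:.2f}")
```

Even though every variable here is unrelated to the outcome by construction, the strongest of the 1,000 correlations will typically look substantial. A model that searches across many variables without further checks could therefore pick up patterns that reflect chance rather than risk.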


Our research found that some insurers seem to have adopted a relatively cautious approach to Big Data. Given the range of possible concerns, this should not be surprising. Insurers are likely aware of the risk of consumer backlash in response to expanded data collection and use. Some insurers also expressed doubt about the extent to which Big Data can improve the accuracy of risk evaluation, judging that improvements may be marginal and that the factors traditionally relied upon by insurers (e.g. demographic characteristics, driving history, vehicle characteristics) will continue to hold by far the most predictive value. Therefore, the jury is still out on whether Big Data will ultimately be a big deal for insurers and consumers.


  1. Big Data might also have important effects on other processes, such as fraud prevention and detection. For a broader discussion, see our report published by the CMA.
  2. See paragraph 123 of CMA, July 2015, Energy market investigation – Summary of provisional findings report.
  3. See page 11 of CII, 2015, Big data and insurance: a conversation.
  4. Recent research shows that, while take-up of telematics policies continues to grow, the rate of growth may be slowing. See BIBA research reported in Telematics Wire, May 2015, 323,000 insurance telematics policies are live in UK: British Insurance Brokers’ Association.
  5. For example, see the Consumer Intelligence research reported in Your Wealth, November 2014, Third of insurance consumers unaware insurers use social media to check claims.
  6. For example, research commissioned by the FCA found that “[i]t was evident that ticking the box to say they had read the T&Cs both on the PCW and the insurer website was almost an automatic response and something of an accepted social norm”. See Atticus, June 2014, Price comparison website: Consumer market research.
  7. For example, see Dionne and Rothschild, 2014, ‘Economic Effects of Risk Classification Bans’, The Geneva Risk and Insurance Review 39.2: 184-221.
  8. An oft-cited result of statistical analysis is that there is near-perfect correlation between per-capita cheese consumption and the number of people who die after becoming entangled in their bed sheets.
  9. For further discussion of these issues see Swedloff, 2014, ‘Risk Classification’s Big Data (R)Evolution’, Connecticut Insurance Law Journal, Vol. 21.
