Big data is increasingly used across multiple industries to better understand customers, competitors, trends, and more.

Big Data Analytics is the use of large collections of data gathered from inside and outside the company. Making use of such datasets is generally very complex, and traditional processing applications may not be enough. This gap in traditional processing applications has stimulated the growth of multiple companies interested in capitalizing on Big Data Analytics.

There are several definitions of Big Data Analytics, and academic literature does not agree on a single one. This can create complexity, given the presence of complex linkages and hierarchies among all data (Troester 2012). Three different perspectives are possible (Hu et al. 2014):

  • According to the attributive definition, “Big Data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis” (Carter 2011).

  • Based on the comparative definition, instead, “Big Data are datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze” (Manyika et al. 2011).
  • The architectural definition cites Big Data as projects “where the data volume, acquisition velocity, or data representation limits the ability to perform effective analysis using traditional relational approaches or requires the use of significant horizontal scaling for efficient processing.”

Definition of Big Data Analytics
Big Data Analytics is one of the next big things in organizations. It came onto the scene at the beginning of the twenty-first century. The first organizations to embrace it were online and startup companies. Companies such as Google, eBay, LinkedIn, and Facebook relied on Big Data Analytics from the beginning. Google succeeded in the business of helping people search through millions of websites and zettabytes of data in order to provide near-instantaneous results with pinpoint accuracy (Cutroni 2010). Various Big Data Analytics methods and solutions help in obtaining this result. In the past decade, a variety of industries in the finance, manufacturing, retail, and technology sectors have been using Big Data Analytics to improve their processes or to better understand and deliver services to their customers.

Big Data Analytics generates value from the storage and processing of very large quantities of digital information. Traditional computing techniques are not efficient in this case. The data involved are similar to “small data” but relatively larger in volume. Having more data requires different approaches:

  • Techniques, solutions, and architecture
  • Solutions for new problems or for old problems in a better way
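As a minimal sketch of what a "different approach" can mean in practice, the Python example below computes an average in a single pass over a stream of values instead of loading the whole dataset into memory first. The function names `read_amounts` and `streaming_mean` and the one-amount-per-line input format are assumptions made for illustration; they do not come from the source.

```python
from typing import Iterable, Iterator


def read_amounts(lines: Iterable[str]) -> Iterator[float]:
    """Lazily parse one amount per line, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield float(line)


def streaming_mean(values: Iterable[float]) -> float:
    """Compute the mean in one pass, keeping only a count and a running total."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
    if count == 0:
        raise ValueError("no values to average")
    return total / count


if __name__ == "__main__":
    # A small in-memory list stands in for a file too large to load at once.
    sample = ["10.0\n", "\n", "20.0\n", "30.0\n"]
    print(streaming_mean(read_amounts(sample)))
```

Because both functions consume their input lazily, the same code could be pointed at an open file handle, so memory use stays constant regardless of the number of lines.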

The reasons for the interest in Big Data Analytics are as follows:

  • The growth in the quantity of processable data
  • The increase in data storage capacities
  • The increase in data processing power
  • The availability of data (different data types)

Big Data Analytics provides opportunities in existing environments. It also creates new opportunities for the stakeholders of financial institutions. These opportunities could not be realized by dealing with structured content in traditional ways. Big Data Analytics has three characteristics, the so-called 3 Vs:

  • Volume: The quantity of data should be relatively large. The word “relative” refers to the organization: a small organization might treat as Big Data a lower volume of data than a large organization would. Volume refers to the large and exponentially growing amount of data flowing in and out of every financial services company, as well as the data generated internally.

Examples of these can be found in a variety of sources, including:
–– the structured granular call detail records (CDR) in a call center;
–– the detailed sensor data from telematics devices, such as personal computers (PCs), mobile devices, ATMs, Point of Sale (POS) terminals, and so on;
–– external information, including open data, marketing research, and other behavioral data;
–– unstructured data from social media, reports of different types, and so on. 

  • Velocity: Financial institutions must be able to process, access, analyze, and report huge volumes of information as quickly as possible in order to make timely decisions, especially in the operational environment. Financial institutions also need to (Bhargava 2014):

–– reduce latency to optimize transparency, cross-selling, and upselling in the different channels;
–– provide quick search of enterprise intranet documents to study the impact of different events and decisions;
–– decrease the business delivery time for reports in a data warehousing environment.

Resources and solutions are needed to process the data fast enough that they do not “age” too much:
–– clickstreams and ad impressions capture user behavior at millions of events per second;
–– machine-to-machine processes exchange data between billions of devices; and
–– infrastructure and sensors generate massive log data in real time. 
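
To make the idea of data "aging" concrete, the following Python sketch keeps only the events that fall inside a recent time window and reports the event rate over that window. The class name `SlidingWindowRate`, its methods, and the assumption that timestamps arrive in non-decreasing order are illustrative choices, not something described in the source.

```python
from collections import deque


class SlidingWindowRate:
    """Tracks events seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> None:
        # Assumption: timestamps arrive in non-decreasing order.
        self._timestamps.append(timestamp)
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop events that have aged out of the window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def rate(self, now: float) -> float:
        """Events per second over the window ending at `now`."""
        self._evict(now)
        return len(self._timestamps) / self.window_seconds


if __name__ == "__main__":
    counter = SlidingWindowRate(window_seconds=10.0)
    for t in [0.0, 1.0, 2.0, 11.5]:
        counter.record(t)
    print(counter.rate(now=11.5))
```

A deque is used because old events are always removed from the front and new ones appended at the back, so both operations stay cheap even at high event rates.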

  • Variety: The majority of an organization’s data (estimated on average at around 85%) is unstructured. This means that further processing is necessary in order to analyze the data. In addition, data do not flow into the organization in a constant manner; peak loads may occur with daily, seasonal, or event-triggered frequencies. Different sources may also require different architectures and technologies for the analysis (audio, text, video, and so on). Data can come from disparate sources beyond the usually structured environment of data processing, including mobile, online, agent-generated, social media, text, audio, video, log files, and more. Big Data Analytics is not just numbers, data, and strings. It also covers documents, geospatial data, three-dimensional data, audio, photos and videos, and unstructured text, including log files and social media. Processing such a variety of information is not easy. Traditional database systems address smaller volumes of structured data with fewer updates and a predictable, consistent data structure. In general, it is possible to classify the data used in Big Data Analytics as:

–– Structured: Most traditional data sources are structured.
–– Semi-structured: Many sources of Big Data Analytics are semi-structured.
–– Unstructured: data such as video and audio.
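
The three classes above can be illustrated with a rough Python heuristic. This is only a sketch under stated assumptions: the column tuple `TRANSACTION_COLUMNS`, the function `classify`, and the rules it applies (a fixed-length tuple counts as structured, JSON text as semi-structured, everything else as unstructured) are hypothetical and far simpler than what a real ingestion pipeline would need.

```python
import json
from typing import Any

# Hypothetical fixed schema standing in for the columns of a relational table.
TRANSACTION_COLUMNS = ("account_id", "amount", "currency")


def classify(record: Any) -> str:
    """Return 'structured', 'semi-structured', or 'unstructured'."""
    # A tuple matching the fixed column layout behaves like a table row.
    if isinstance(record, tuple) and len(record) == len(TRANSACTION_COLUMNS):
        return "structured"
    # Text that parses as a JSON object or array carries its own flexible schema.
    if isinstance(record, str):
        try:
            parsed = json.loads(record)
        except json.JSONDecodeError:
            return "unstructured"
        if isinstance(parsed, (dict, list)):
            return "semi-structured"
        return "unstructured"
    # Raw bytes (audio, video, images) and anything else have no declared schema.
    return "unstructured"


if __name__ == "__main__":
    samples = [
        ("ACC-1", 120.5, "EUR"),
        '{"user": "a", "tags": ["x"]}',
        "Customer called about a late fee.",
        b"\x00\x01",
    ]
    for sample in samples:
        print(classify(sample))
```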

Literature: The Future of FinTech, Integrating Finance and Technology in Financial Services, Bernardo Nicoletti, 2017