Why is the market calling for Big Data Analytics?

Data is a key asset for any business wishing to join Industry 4.0. From finding out what your customers want, to knowing how your stock is changing or measuring the evolution of your main KPIs, data is indispensable to any organization that wants to survive. However, when data is wrongly interpreted, it can hide vital information and blur decision-making. When the volume of data grows until it becomes unmanageable, a Big Data Analytics culture becomes essential to help us make sense not only of what is happening, but above all of why.

To understand what Big Data Analytics is, first we need to understand the magnitude of Big Data. Let’s picture an organization in the food industry with international business, a portfolio of thousands of products and with a wide supply and logistics chain, including several stakeholders ranging from animal breeders to supermarket directors.

Inside this chain, each product, along with its costs, prices, distribution and sales channels, is mapped daily, generating millions of important data points that could be used to better understand the business and support decision-making.

All this information could be stored in spreadsheets if the records did not go beyond about 1 million rows (see our PT-BR article on spreadsheet limitations). As expected, this case exceeds that threshold, and data ends up being collected and scattered among many sheets, sectors, processes, and so on. With that, some questions arise:

  • How can we know which markets are thriving?
  • What are the market trends in different regions?
  • Where are we short in stock and where is it excessive?
  • Why am I losing market share?
  • What are the main bottlenecks of my supply chain?
  • Which factors are most relevant to my profit margin by product, city, state, country?

Big Data Analytics is key to answering these and many other questions.
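
As a minimal sketch of how records scattered across several sheets can be brought together to answer one of these questions (where stock is short and where it is excessive), consider the Python example below. The sheet contents, the column names (`region`, `product`, `stock`, `weekly_demand`) and the coverage thresholds are all hypothetical, chosen only for illustration.

```python
import csv
import io
from collections import defaultdict

# Two hypothetical "sheets", exported as CSV by different departments.
SHEET_SOUTH = """region,product,stock,weekly_demand
South,yogurt,120,400
South,cheese,900,150
"""
SHEET_NORTH = """region,product,stock,weekly_demand
North,yogurt,800,200
North,cheese,50,100
"""


def load_rows(*sheets):
    """Read every sheet and yield its rows as dictionaries."""
    for text in sheets:
        yield from csv.DictReader(io.StringIO(text))


def stock_coverage(rows):
    """Return the weeks of demand covered by stock, per (region, product)."""
    stock = defaultdict(float)
    demand = defaultdict(float)
    for row in rows:
        key = (row["region"], row["product"])
        stock[key] += float(row["stock"])
        demand[key] += float(row["weekly_demand"])
    return {key: stock[key] / demand[key] for key in stock if demand[key] > 0}


def classify(coverage, low=1.0, high=4.0):
    """Label each entry as short, excessive or balanced (illustrative thresholds)."""
    labels = {}
    for key, weeks in coverage.items():
        if weeks < low:
            labels[key] = "short"
        elif weeks > high:
            labels[key] = "excessive"
        else:
            labels[key] = "balanced"
    return labels


coverage = stock_coverage(load_rows(SHEET_SOUTH, SHEET_NORTH))
labels = classify(coverage)
for key in sorted(labels):
    print(key, round(coverage[key], 2), labels[key])
```

With real volumes the same idea would run on a database or a distributed engine rather than in memory, but the logic of consolidating first and asking the question afterwards stays the same.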

Dimensions of Big Data Analytics  

The 5Vs of Big Data

What Big Data Analytics does is deal with huge volumes of varied data with velocity and veracity, aiming to transform them into business value. These are the 5 Vs of Big Data, and understanding them is fundamental to seeing where we can apply it to your business:

  • Volume: Big data deals with huge amounts of data, turning them into information and then into knowledge. It is not uncommon in our projects to deal with millions of samples and thousands of factors. Drawing a parallel, picture a sheet with millions of rows and thousands of columns. It would be hard to make sense of anything without Big Data Analytics, wouldn’t it?
  • Variety: data acquisition can include multiple departments and sources inside an organization. We may need to collect data from clients and cross it with geo-populational data warehouses and government data, among others. All following current laws, of course! Discovering and grouping relevant data while keeping our feet on the ground is a big challenge that demands a mature data culture;
  • Velocity: naturally, acquiring data is not enough. We need to devise effective strategies to transform it into knowledge as quickly as possible, before the competition takes the lead. In the race for information, interpreting data promptly is power;
  • Veracity: here at Aquarela we usually say that running models, crunching numbers and reaching results is the easy part. The difficulty lies in making sure that our analysis is leading us to coherent, real and high-value conclusions. In the end, a Big Data model is only as good as the data we put into it. It is up to us to understand what makes sense and guarantee that the result mirrors reality;
  • Value: it is in this V that the main result driving Big Data Analytics lies: transforming data into value. After all, understanding what is going on and why things are happening is fundamental to supporting more consistent and accurate decision-making.

Big Data Analytics increases business intelligence. While traditional analysis aims to explain what is happening (in a very limited scope, by the way), Advanced Analytics is capable of finding the whys, what is hidden, or even foresight into what is going to happen. Big Data Analytics is a huge ally in developing new products, reducing costs and increasing efficiency, besides helping quick and safe decision-making.

Right data + right questions = right answers

Have you ever imagined opening a sheet with 6 billion cells in Excel? Or, if the sheet does open, trying to find behavior patterns that make sense and help you understand what is going on? Hard, isn’t it? The first challenge of Big Data Analytics is always to gather (or mine) data, a task that Data Engineers master. Data mining is vital (and usually a huge bottleneck) so that we can access data and prepare it for analysis. Only the right data can offer us the right answers.

Now we are ready to start the analysis, with data extracted, transformed and loaded (ETL), clean and coherent. This is where quantitative studies with mathematical models, or even Machine Learning models, come in, so that we can use data to solve a wide range of problems. At this point, Data Scientists and Machine Learning Engineers join the search for solutions, which are often hidden. It is the role of this team to create a scalable architecture, understand the client’s real problems and fulfill the 5 Vs.

Big Data Analytics Maturity in Brazilian market

Clearly, reaching such maturity and proficiency in the 5Vs, within a data-driven culture with well-defined governance processes, is not an easy task.

To investigate this, we carried out a research study in 2018 that produced a report on the reality of data maturity in Brazilian organizations. On a scale from 1 to 5, participating companies reported their level of maturity.

We present the results below. They indicate that the automation of intelligent behavior (level 5) is still rare, while the great majority of companies already have BI (business intelligence) systems in place (level 3).

Data maturity level results in Brazilian organizations (Aquarela 2018)

Big Data for Big Business

Revenue from Big Data and business analytics is projected to reach 274.3 billion dollars by 2022 (IDC), with companies like Netflix saving up to 1 billion dollars per year by using Big Data (TechJury). For such organizations, a mature data culture is essential to keep growing and stay at the leading edge of the market. Organizations that use Big Data, either in-house or with partners, see profit increase by 8 to 10% (Entrepreneur), with benefits such as (Chicago Analytics Group):

  • Innovation cycles 25% faster;
  • A 17% increase in efficiency and productivity;
  • Research and Development 13% more efficient;
  • 12% more differentiation in product and service development.

No wonder many companies are seeking to improve their relationship with data. Unfortunately, data culture is still rare globally. Around 87% of companies still have low maturity in business intelligence and analytics (Gartner). The costs of such misinformation and low data quality add up to 3.1 trillion dollars per year in the US economy alone (IBM).

The rapid growth and high complexity of Big Data Analytics make it evident that the industry needs support. Organizations specialized in analytics can help propel digital transformation, especially through the quick implementation of data and artificial intelligence solutions. Several organizations take high technological risks by trying to internalize activities that are far from their core business, when it would be much safer and more profitable to work together with companies specialized in Analytics.

Big Data Advanced Analytics culture at Aquarela

At Aquarela, our Big Data Advanced Analytics culture was developed, and constantly evolves, with a focus on the Big Data Vs, good data governance practices and the enhancement of the technology stack that makes up our VORTX platform.

We seek to deliver a results-driven experience based on analytics, capable of changing our clients’ culture. Our goal is to help the development of industries and services that are going through an intense and necessary process of digital transformation. To achieve this, we trust our clients and seek solutions together with them, with all parties being equally essential to each project’s success. This goes beyond isolated data analysis, because collaborative evolution is an intense process guided by data, business experts, and information technology specialists.

As a result of our culture, we are able to raise the data maturity of our clients, providing dynamic and intelligent automated pricing systems, recommenders of logistics actions, strategic business intelligence maps integrated through AI, and optimizers of industrial products, also using AI. It is our wide range of solutions that generates an expanded intelligence, which would not be possible without all the components of data culture acting in synergy within a clear vision of what artificial intelligence is.


Big Data Analytics is a very wide topic, and the 5Vs help us simplify the concept so that managers can promote practical changes in the reality of organizations. Today, many companies have difficulty reinventing themselves in this new digital economy, whether because of technical limitations in the intensive use of spreadsheets or because of methodological and cultural limitations related to data.

In this article we set out to show you how the market is increasingly demanding analytics, which business points matter most in this regard, and the current data maturity levels of Brazilian organizations. The main themes we recommend managers keep an eye on are:

  • Data governance
  • Development of a data-driven culture
  • Optimization of distribution chains, logistics and commercial processes design
  • Data privacy and ethics
  • Data Analysis team training – We provide an e-book on analysis structuring (PT-BR)

Our interdisciplinary squads work daily with cutting-edge technology to understand the challenges, find opportunities and solve your biggest problems. If data is power in Industry 4.0, we thrive on empowering our clients to transform data into information, information into knowledge, and knowledge into strategic value for their businesses. It is through digital transformation that Aquarela expands the world’s intelligence.

Which Big Data Analytics challenges are you facing today? And what are you doing to overcome them? Leave us a comment!

What is Aquarela Advanced Analytics?

Aquarela Analytics is a pioneering Brazilian company and a reference in the application of Artificial Intelligence in industry and large companies. With the Vortx platform and DCIM methodology, it serves important global customers such as Embraer (aerospace & defence), Scania and Randon Group (automotive), Solar Br Coca-Cola (beverages), Hospital das Clínicas (healthcare), NTS-Brasil (oil & gas), Votorantim Energia (energy), among others.

Stay tuned by following Aquarela on LinkedIn!

The Future of Financial Analysis with Advanced Analytics.

The way financial analysis is conducted is changing fast. Over the last two decades, companies in general have undergone an intense process of computerization that started in accounting, with the use of management systems such as ERPs and CRMs. Today they produce much more data than ever before, and this asset needs to be analyzed from the finance and investment standpoint as well as from that of Data Analytics and Advanced Analytics.

In this article we briefly compare the main changes taking place in the way financial analysts work, and what they mean for the analysts’ future in relation to the area of Advanced Analytics.

Key motivators of changes in how to do financial analysis

The connectivity of recent years has generated business models that could never have been imagined before, able to serve varied audiences 24 hours a day and with a scalability unprecedented in history, Uber being one example. In addition, the volume of data has grown in size and complexity, creating a potential for insights and business transformations that can escape purely financial analysis done only by conventional methods.

Traditional Financial Analysis and Advanced Analytics Techniques

Financial analysis has well-established and widespread analysis methods. To say whether or not a company is interesting for business or investment is a task that is often satisfied by the analysis of past accounting / financial indicators. To do this, each analyst has specific criteria to evaluate the economic and financial feasibility of new investments, serving both corporate finance and personal finance.

Data Analytics methods, in turn, can be used to automate and optimize financial decisions, following methods that are already used in the area. However, Advanced Analytics techniques also make it possible to incorporate machine learning and artificial intelligence algorithms to develop predictions in an innovative way that is still unexplored in the market: this is what will generate a great competitive advantage for companies in the financial market in Industry 4.0.

Data Analytics and Advanced Analytics in practice:

Automate and optimize financial analytics with Data Analytics:

  • creation of datasets that are automatically updated with macroeconomic data (basic interest rate, inflation, GDP, among others);
  • automated collection of data from financial statements, either from publicly traded companies via the APIs of existing databases, or from privately held companies through the extraction of tables from PDF files, for example;
  • creation of automated descriptive reports on the main changes in the macroeconomic scenario.

Make predictions with Advanced Analytics – machine learning, artificial intelligence, among others:

  • models adaptable to changes in economic and financial reality, capable of making recommendations and indicating directions for decision making.

If the financial analyst uses spreadsheets, such as Excel, for their analyses, they can optimize the data extraction and cleaning processes with Data Analytics techniques and, in the end, obtain an Excel spreadsheet as output, so that they can carry out the financial analyses they are already accustomed to. However, the great competitive advantage lies in the hands of analysts who can use Advanced Analytics to transform the way they perform financial analysis.

The area of finance is also strongly influenced by the use of econometric methods to make forecasts. However, conventional econometric models are usually static. Several robustness tests are usually run to validate such models, but the problem is that most of them do not adapt to changes in economic and financial reality, a typical situation given the dynamism of financial markets. Versatility and adaptability to change are characteristics of models that use machine learning and artificial intelligence techniques, within a coherent implementation of the data analytics culture among financial analysts.
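
To make the contrast between static and adaptable models concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: the data is synthetic, the relationship is a one-variable regression through the origin, and "adaptable" is represented by simply re-estimating the slope on a rolling window, which is far simpler than a real machine learning pipeline. All function names are hypothetical.

```python
def fit_slope(xs, ys):
    """Least-squares slope for the model y = b * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def make_series(n=100, break_at=50):
    """Synthetic data: the true slope jumps from 2 to 5 at `break_at`."""
    xs = [1.0 + (t % 10) for t in range(n)]
    ys = [(2.0 if t < break_at else 5.0) * x for t, x in enumerate(xs)]
    return xs, ys


def static_forecasts(xs, ys, train=30):
    """Estimate the slope once on the first `train` points and never update it."""
    b = fit_slope(xs[:train], ys[:train])
    return [b * x for x in xs[train:]]


def rolling_forecasts(xs, ys, train=30, window=10):
    """Re-estimate the slope before each forecast from the latest `window` points."""
    preds = []
    for t in range(train, len(xs)):
        b = fit_slope(xs[t - window:t], ys[t - window:t])
        preds.append(b * xs[t])
    return preds


def mean_abs_error(preds, actual):
    """Average absolute difference between forecasts and observed values."""
    return sum(abs(p - a) for p, a in zip(preds, actual)) / len(preds)


xs, ys = make_series()
tail = 20  # evaluate on the last 20 points, well after the structural break
static_err = mean_abs_error(static_forecasts(xs, ys)[-tail:], ys[-tail:])
rolling_err = mean_abs_error(rolling_forecasts(xs, ys)[-tail:], ys[-tail:])
print(f"static: {static_err:.2f}  rolling: {rolling_err:.2f}")
```

The static model keeps forecasting with the slope it learned before the break, while the re-estimated one recovers as soon as its window contains only data from the new regime.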

The Data Analytics culture presents a way of acquiring analytical knowledge that differs from the traditional model. Acquiring Analytics knowledge is more decentralized, thanks to the internet and the sharing of code in package form (an influence of computer science and versioning techniques). That is, instead of the analyst spending months or even years building all the calculations in isolation in an Excel spreadsheet to reach a conclusion, the culture of Data Analytics makes it possible to import complete packages of code that perform complex analyses on the data in minutes, greatly speeding up the process.

To give an idea of the growth of this type of approach to problem solving, we present below the volume of packages added to CRAN, the main repository of R language packages.

The possibilities become so broad in this new mode that, in a few seconds, it is possible to install and run commands that automatically generate Internet memes, like this one, with only 4 lines of commands.

For more information on this small package, see this article.


The same ease extends to financial packages, such as:

TTR, tidyquant, PerformanceAnalytics, PortfolioAnalytics, quantmod, Quandl, among others.

Earlier we wrote about the need to incorporate R, given the traditional limitations of Excel – Leaving from Limited Excel to R or better Python? 

Comparison of traditional methods of financial analysis, Data Analytics and Advanced Analytics

Typically, traditional financial analysis methods involve stable, well-established valuations that require no presentation or discussion of the methods used. In Analytics methods, by contrast, communities share code and tools, not just concepts. See the table below for a comparison of the approaches.

| Criterion | Financial analysis | Data Analytics | Advanced Analytics |
|---|---|---|---|
| Replication level and analysis speed | Low, since each worksheet is self-contained and changes are not shared. | Intermediate, using good scripting practices and collaborative work. Spreadsheets in dataset format. | High, using structured systems that operate in a distributed way. Multiplatform and scalable. |
| Use of Artificial Intelligence | Low | Intermediate | High |
| Predictive analysis | Trend analysis with strong use of time series and regression methods. In general, these are robust but static models. | Predictions with statistical weights for all variables analyzed, with a wide range of generic algorithms available for analysis. | Continuous improvement of the accuracy, speed and assertiveness of predictive models, with weights for all variables discovered by the algorithms themselves. |
| Analysis focus | Internal financial health data of the organization, comparison with similar organizations. Macroeconomic analyses based on theoretical premises. | Internal data, data linked to macroeconomic aspects, analysis of texts (such as minutes and explanatory notes), investigation of relationships with non-financial data as well. | Internal and external data at various levels of granularity. |
| Key analysis tools | Excel and transactional systems such as ERPs, CRMs and SCMs. Statistical and econometric software such as SPSS, Eviews and Stata. | R, Python or other specific programming notebooks, data cleaning tools, data mining algorithm suites. Plain text editors (example: Sublime). Git (versioning of code and creative artifacts). | Machine learning and artificial intelligence platforms that cover the use of several algorithms. Distributed computing platforms such as Spark and Hadoop. |
| License | Closed-source tools | Open-source tools | Mix of open-source and closed-source tools |
| Analyst main activities | Analysis of financial statements and indicators. Development of economic / financial reports. | Definition of financial analysis structures, preparation of datasets, information flow of the indicators that compose the datasets. Not limited to financial indicators. | Deployment of large-scale models integrated with the transactional systems. |

Final considerations and recommendations

The deepest impact of shifting financial analysis toward Analytics paradigms is on the nature of the work of financial analysts, which becomes oriented toward orchestrating packages and data flows through scripts, with less technical dependence on the IT and development departments.

For those who work in financial analysis and intend to adapt to new, increasingly data-driven market trends, we recommend an in-depth study of the basic packages of programming languages (mainly R and Python), learning how to use code versioning tools (such as Git or GitHub), and participating in Data Science best-practice communities in your region, or even online.

Industry 4.0, Web 3.0 and Digital Transformation

Industry 4.0 is characterized by a change in the flow of value, from centrally designed, resource-intensive products to knowledge-intensive, decentralized services designed and produced with strong support from Advanced Analytics and AI throughout digital transformation.

This process began with the Internet boom in the first decade of the millennium. 2018 seems to be the year of emancipation of Industry 4.0, which ceases to exist only in scientific articles and laboratories and evolves with vigorous support from the budgets of the largest corporations in the world, according to research by the OECD, Gartner Group and PwC.

From our point of view, Industry 4.0 materializes from the concepts of Web 3.0, whose core lies in the democratization of the capacity for action and knowledge (as already discussed in this blog post). But before we get to 4.0, let’s put its previous versions in perspective:

Industry 1.0

Characterized by the discovery of the economic gains of producing something in series rather than by artisanal (individual) production, which made it possible to mechanize labor previously performed only by people or animals. It was the moment when man began to harness the force of water, wind and fire, through steam engines and mills. In 1776, Adam Smith (The Wealth of Nations) presented the advantages of dividing work in a pin factory.

Key Components – Coal and Steam Engines.

Industry 2.0

Its major driver was electricity, which, through generators, motors and artificial lighting, made assembly lines possible and with them the mass production of consumer goods.

Key Components – Electricity and Electromechanical Machines

Industry 3.0

Characterized by automation, its driving force is the use of robots and computers in the optimization of production lines.

Key Components: Computers and Robots

Industry 4.0

Industry 4.0 is characterized by the strong automation of the design, manufacturing and distribution stages of goods and services, with strong use of CI (Collective Intelligence) and AI (Artificial Intelligence). In Industry 4.0, with the evolution of the Web, individuals are increasingly empowered by their agents (smartphones). Meeting the needs of this new consumer is one of the great challenges of the new industry.

To better illustrate this concept we created the following table:

| Generations | Concept (Design) | Manufacture | Distribution | Services | Outcome |
|---|---|---|---|---|---|
| Before industry age | People | People | People | People | Hand-made work |
| Industry 1.0 | People | Machines | People | People | Use of electric, thermal, hydraulic energy |
| Industry 2.0 | People | Machines | People | People | Electric energy as the main driver, start of the assembly line process |
| Industry 3.0 | People using machines (computers) as assistants | Machines | People and machines | People | Use of automation (robots and computers) |
| Industry 4.0 | Collective intelligence + machines | Machines | Machines | Collective intelligence + machines | Use of computational and collective intelligence to create products and services |

In order to understand Industry 4.0 it is important to clarify some concepts that make up its foundations: AI – Artificial Intelligence and CI – Collective Intelligence.

Collective Intelligence

Let’s start with CI, which is more tangible, since we constantly use mechanisms that rely on collective intelligence in the production and curation of content, such as Wikipedia, Facebook, Waze and YouTube.

Wikipedia: most Wikipedia content is produced by hundreds of thousands of editors worldwide and curated by millions of users who validate and review it.

Waze: The Waze application uses its users’ own movement to build and refine its maps, providing real-time alternative routes to escape traffic congestion, as well as routes through new road sections built by cities.

Facebook and YouTube are services that today have a diverse range of content that is spontaneously generated and curated by their users through likes and shares.

What do these mechanisms have in common? They rely on the so-called intelligence of the masses, a concept established by the Marquis de Condorcet in 1785, which quantifies the degree of certainty and uncertainty of a decision made by a collective of individuals.

With hundreds or thousands of individuals acting in their own way, summing all these actions yields a whole that is greater than the sum of the parts. This collective behavior is observed in the so-called swarm effects, in which insects, birds, fish and humans, acting collectively, achieve much greater feats than if they had acted individually.

Condorcet proved this mathematically, inspiring Enlightenment leaders who used his ideas as a basis for the formation of democracies in the 18th and 19th centuries.
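
Condorcet’s result is usually known as the jury theorem. Under the simplifying assumptions that voters decide independently and that each one is right with the same probability p, the chance that the majority is right is a binomial sum. The short Python sketch below computes it; the function name and the value p = 0.55 are chosen only for illustration.

```python
from math import comb


def majority_correct(n, p):
    """Probability that a strict majority of n independent voters is right,
    when each voter is right with probability p. n must be odd to avoid ties."""
    if n % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    need = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


# Each voter is only slightly better than a coin toss (p = 0.55).
for n in (1, 11, 101):
    print(n, round(majority_correct(n, 0.55), 4))
```

When p is above 0.5 the probability grows with the number of voters, and when p is below 0.5 it shrinks, which is why the quality and independence of individual judgments matter as much as their quantity.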

In a contemporary way, we can look at a database as a large lake of individual experiences that form a collective. Big Data is responsible for collecting and organizing this data and Advanced Analytics for improving, creating and re-creating things (disruption) through intensive statistics and AI.

Artificial Intelligence

In a judicious scrutiny, it is possible to understand AI as an artificial implementation of agents that use the same principles of CI – Collective Intelligence.

That is, instead of real ants or bees, artificial neurons and/or insects are used in a computational world (the cloud); they simulate real-world behavior in some ways and thus obtain from the intelligence of the masses decisions, responses and creations.

Take, for instance, this piece used to support a bridge in The Hague, in the Netherlands.

On the left is the original piece created by engineers. In the middle and on the right are two pieces created with an AI approach called a genetic algorithm. The right-hand piece is 50% smaller and uses 75% less material, and yet, because of its design, it is capable of sustaining the same dynamic load as its left counterpart.
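
To give a feel for how a genetic algorithm searches for a lighter design, here is a toy Python sketch. It is not the method used for the bridge piece: the "part" is a hypothetical beam made of 5 segments, the load model (the thinnest segment limits the whole part) and every constant are invented for illustration. Only the overall loop of selection, crossover and mutation reflects the general technique.

```python
import random

N_SEGMENTS = 5        # hypothetical beam split into 5 segments
REQUIRED_LOAD = 20.0  # hypothetical load the part must carry


def material(design):
    """Material used: here simply the sum of the segment thicknesses."""
    return sum(design)


def capacity(design):
    """Toy load model: the thinnest segment limits the whole part."""
    return 10.0 * min(design)


def cost(design):
    """Material plus a heavy penalty when the load requirement is not met."""
    shortfall = max(0.0, REQUIRED_LOAD - capacity(design))
    return material(design) + 100.0 * shortfall


def evolve(generations=200, pop_size=30, seed=0):
    """Return the best design found and the best cost seen at each generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.1, 10.0) for _ in range(N_SEGMENTS)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        next_pop = [pop[0][:]]  # elitism: the best design always survives
        while len(next_pop) < pop_size:
            # Tournament selection: the better of two random designs is a parent.
            mother = min(rng.sample(pop, 2), key=cost)
            father = min(rng.sample(pop, 2), key=cost)
            # Uniform crossover followed by a small Gaussian mutation.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            child = [max(0.1, g + rng.gauss(0.0, 0.1)) for g in child]
            next_pop.append(child)
        pop = next_pop
    best = min(pop, key=cost)
    history.append(cost(best))
    return best, history


best, history = evolve()
print("material:", round(material(best), 2), "capacity:", round(capacity(best), 2))
```

Because the best design of each generation is carried over unchanged, the best cost never gets worse from one generation to the next; in this toy model the search drifts toward designs whose segments are just thick enough to carry the required load.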

There are hundreds of AI use cases, ranging from the detection of smiles by cameras and cell phones to cars that move autonomously among cars with human drivers in big cities.

Each AI use case relies on a set of techniques that can involve Machine Learning, insight discovery and optimal decision making through predictive and prescriptive Advanced Analytics and Creative Computing.


The intensive use of CI and AI can generate new products and services creating disruptions that we see today in some industries promoted by companies like Uber, Tesla, Netflix and Embraer.


In the case of Uber, the company makes heavy use of CI to generate competition and, at the same time, collaboration between drivers and passengers, which is complemented by AI algorithms to deliver a reliable transportation service at a cost never before available.

Despite being 100% digital, it is revolutionizing the way we are transported, and very soon it will launch its 100% autonomous taxis and, in the near future, drones that carry passengers through the skies. This is a clear example of digital transformation through redesign from the perspective of Industry 4.0.


Tesla uses CI from the data captured from the drivers of its electric cars and, applying Advanced Analytics, optimizes its own processes; it also uses this data to train the AI that today is able to drive a car safely in the traffic of big cities around the world.

Tesla is a remarkable example of Industry 4.0. They use CI and AI to design their innovative products, a chain of automated factories to produce them and sell them online. And very soon they will transport and deliver their products to the buyer’s door with their new electric and autonomous trucks, completely closing the Industry 4.0 cycle.


Netflix, in turn, uses the viewing history and the ratings given by its users to generate lists of recommendations that serve as input for the creation of originals such as the hits House of Cards and Stranger Things. In addition, they use the AI of the Bandit algorithm (from Netflix itself) to generate title covers and curate lists, which attracts users (viewers) to new content.
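
To illustrate the general idea behind bandit algorithms, the Python sketch below simulates an epsilon-greedy strategy choosing among title covers. This is a generic textbook variant, not Netflix’s actual implementation; the cover names and click probabilities are hypothetical and are used only to simulate viewers.

```python
import random


def epsilon_greedy(click_prob, rounds=5000, epsilon=0.1, seed=0):
    """Simulate showing one cover per round and learning from clicks.

    click_prob maps each cover to its true (hidden) click probability,
    which is used only to simulate user behaviour.
    """
    rng = random.Random(seed)
    covers = list(click_prob)
    shows = {c: 0 for c in covers}
    clicks = {c: 0 for c in covers}

    def estimate(c):
        """Observed click rate of a cover so far (0.0 if never shown)."""
        return clicks[c] / shows[c] if shows[c] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            cover = rng.choice(covers)         # explore: try a random cover
        else:
            cover = max(covers, key=estimate)  # exploit: best cover so far
        shows[cover] += 1
        if rng.random() < click_prob[cover]:
            clicks[cover] += 1
    return shows, {c: estimate(c) for c in covers}


shows, rates = epsilon_greedy({"cover_a": 0.05, "cover_b": 0.10, "cover_c": 0.20})
print(shows)
print(rates)
```

The design choice here is the balance between exploration and exploitation: a small share of the traffic keeps testing every cover, while the rest goes to whichever cover currently has the highest observed click rate.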


Embraer, the world’s third largest producer of civil aircraft and the largest innovation company in Brazil, uses AI, CI and Advanced Analytics in equipment maintenance systems.

By using these techniques it is possible, based on maintenance experiments and risk mitigation procedures applied to an AI, to reduce the costs of troubleshooting processes for high-value equipment, with savings of up to 18% in an industry where apparently low margins can generate considerable competitive impact.

Conclusions and Recommendations

The path to industry 4.0 is paved by the techniques of CI, AI, Advanced Analytics, Big Data, Digital Transformation and Service Design and with good examples of global leaders.

Transformation is often a process that can generate anxiety and discomfort, but it is necessary to achieve the virtues of Industry 4.0.

We suggest starting small and thinking big. Start by thinking about data, the building blocks of every Digital Transformation, and by nurturing a Data Culture in your business / department / industry.

And how do you start thinking about data? Start by defining your data dictionaries; they will be your nautical charts on the Digital Transformation journey.
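
As a minimal sketch of what a data dictionary can look like in practice, the Python example below describes a few fields and checks records against that description. The field names, types and units are hypothetical; a real dictionary would also cover sources, owners and business rules.

```python
# Hypothetical data dictionary: field name -> expected type, unit and description.
DATA_DICTIONARY = {
    "product_id": {"type": str, "unit": None, "description": "Internal product code"},
    "unit_price": {"type": float, "unit": "BRL", "description": "Sale price per unit"},
    "stock": {"type": int, "unit": "units", "description": "Units currently in stock"},
}


def validate(record, dictionary=DATA_DICTIONARY):
    """Return a list of problems found in a record; an empty list means it conforms."""
    problems = []
    for field, spec in dictionary.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], spec["type"]):
            problems.append(f"wrong type for {field}: expected {spec['type'].__name__}")
    for field in record:
        if field not in dictionary:
            problems.append(f"undocumented field: {field}")
    return problems


print(validate({"product_id": "A-17", "unit_price": 9.9, "stock": 40}))
print(validate({"product_id": "A-18", "unit_price": "9,90", "color": "red"}))
```

Even a small dictionary like this makes disagreements visible early, for example a price stored as text in one department and as a number in another.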

Understanding the potential of data and the new business it can generate is instrumental in the transition from producer of physical goods to service provider, whether or not the services are supported by physical products. Look at Uber and Airbnb: neither owns cars or real estate, but they are responsible for a generous share of the transportation and accommodation markets.

We recommend raising the degree of maturity beginning with a diagnosis, then the elaboration of a plan of action and its application.

At Aquarela we have developed a Business Analytics Canvas Model which is a Service Design tool for the development of new business based on Data. It is possible to promote the intensive use of CI, AI in the stages of Design and Services, the links that characterize the change from Industry 3.0 to 4.0.

We will soon publish more about Business Analytics Canvas Model and Service Design techniques for Advanced Analytics and AI.

Yellow Ribbon September – towards celebration of life

Aquarela starts September engaged in the campaign for valuing life, bringing to light a subject that has to be talked about. All the way from schools to the corporate world, mental suffering can be silently present in a colleague, neighbor or relative, and a refuge can make all the difference for them.

Suicide is a phenomenon present in all cultures since the beginning of human history. It is related to emotional, mental, social and economic aspects.

The person suffers from ambivalent feelings; they do not want to die, but they want to put an end to their psychic pain (or physical pain, in chronic cases). Since the subject is seen as a taboo, full of prejudice, it gets stigmatized, which makes it harder to reach out for help or simply to have a conversation. The subject is simply avoided.

However, this year, the ‘blue whale’ “fever” as well as the ‘13 Reasons Why’ TV series raised public interest in suicide. Some parents lost sleep, searched for information and sought help from health professionals. But the thought of suicide is not present only in the minds of the young; it is present in other age groups, including the elderly. That is one more reason why suicide has to be discussed.

The good news is that suicide can be prevented, as long as it is treated as a public health issue, supported by information and prevention projects. Some relevant data follows below.

World Health Organization Data

According to the Pan American Health Organization (PAHO/WHO):

  • over 800 000 people die by suicide every year;
  • suicide is the second leading cause of death among young people aged 15 to 29;
  • only 60 of the 172 member nations provide data considered to be of good quality;
  • it is estimated that 28 countries have national suicide prevention strategies;
  • in the Mental Health Action Plan 2013-2020, the WHO member states committed to reducing suicide rates by 10% by 2020;
  • around 75% of suicides happen in low- and middle-income countries;
  • in wealthy countries, three times as many men as women die by suicide;
  • in high-income countries, the highest suicide rates are related to alcohol abuse and depression;
  • 90% of all suicides can be avoided;
  • in Brazil the average is 6 to 7 deaths per 100 000 inhabitants, which is considered low. However, this figure is not reliable, since data quality in our country has a lot of room for improvement.

“Every 40 seconds one person dies by suicide”

Artificial Intelligence and suicide

Artificial Intelligence (AI) can provide means for identifying patterns and suicidal behavioral tendencies, helping to refine preventive actions.

Recently, suicide-related movements such as the previously mentioned ‘Blue Whale’ have gained visibility by spreading on social networks. There are also cases of people who express their feelings individually, also through social networks.

Considering that, Artificial Intelligence algorithms and big data techniques can help infer which individuals need help. Companies like Facebook, Instagram and Google have already announced that they will use AI on their platforms to provide warnings and prevention.

But much more can be done with the new technologies by bringing together technologists, teachers, professors, psychologists and other professionals. They can provide preventive measures, identify people at risk of suicide, and offer protection by means of a support network.

An analysis from Aquarela

Based on the death records of 645 municipalities in the state of São Paulo, Joni Hoppen, one of Aquarela’s founders, found that:

  • of 300 000 deaths, 2,223 were suicides;
  • most of those deaths were recorded with an unknown or unreported profession. The exception was masons;
  • either the lack of a professional identity can lead to suicide, or health professionals and family have great difficulty describing these people’s jobs;
  • Joni had difficulty determining whether the masons really died by suicide or whether the deaths were work accidents reported as suicides because of labor issues;
  • he applied a filter for “lawyers”, which returned 18. The ratio of lawyers in the state compared with other occupations such as janitors, shopkeepers and security guards indicates that people in a favorable economic situation are also present in the statistics;
  • men with a high level of schooling die by suicide more often.

You can see the whole post (in Portuguese) here.

Humans construct their identity based on personal, social and professional relations. Jobs carry socio-historical meanings and define the role of an individual in society, and this role affects how each person is seen by others and also how they evaluate themselves. When those views become dysfunctional, health issues such as depression and suicidal thoughts can appear.


For people who are considering suicide not to be ashamed or afraid of reaching out for professional help, information and a welcoming environment are necessary.

It is necessary to be open to their pain and suffering, without judgment or prejudice, showing interest and being available for them.

Discussing the issue helps the population as well as institutions to establish prevention strategies. The objectives of an intervention include recovering self-esteem, promoting emotional well-being and establishing bonds of affection that can provide a support network for the individual.

In Brazil, we have the Centro de Valorização da Vida (CVV) (Center for the Valorization of Life), an NGO that provides free, volunteer-run emotional support and suicide prevention services through chat, telephone, Skype and email, always preserving the individual’s privacy.

Additional information:

Booklet distributed by the Conselho Federal de Medicina (Federal Council of Medicine): http://www.flip3d.com.br/web/pub/cfm/index9/?numero=14#page/1

WHO’s first report on suicide: http://www.who.int/mental_health/suicide-prevention/world_report_2014/en/


Human Resources Optimised with Advanced Analytics

Today we are going to present some insights into employee job satisfaction obtained with Advanced Analytics tools and techniques. As the source for this study, we use the data made available on this link by the data scientist Ludovic Benistant, who applied important anonymizations. Some pictures contain Brazilian Portuguese words, sorry about that! Let’s go!

Research Questions

Following DCIM (Data Culture Introduction Methodology) to guide this research, we came up with the following questions:

  • What factors have the greatest influence on employee satisfaction?
  • What are the main satisfaction scenarios that exist?
  • What are the main patterns associated with key satisfaction scenarios?
  • What factors influence professionals to leave?

Data Characteristics

In total, 14,999 employees were evaluated, considering the following variables already sanitized by our scripts:

  • Employee satisfaction level (0 to 10) – Probably filled out by the employee;
  • Last evaluation (0 to 10) – Probably filled in by a manager;
  • Number of projects (2 to 7) – Number of projects in which the employee took part;
  • Average monthly hours (96 to 310);
  • Time spent at the company (2 to 10) – How long the person has worked at the company;
  • Whether they have had an accident at work – (Yes = 1 / No = 0);
  • Whether they have had a promotion in the last 5 years (Yes = 1 / No = 0);
  • Salary Range (Low = 1, Medium = 2, High = 3); Note: Actual values were not made available.
  • Left the company (Yes = 1 / No = 0).
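The variable list above can be captured as a small record type with range checks. This is only a minimal sketch: the field names, the `BOUNDS` table and the `validate` helper are our own assumptions for illustration, not the column names of the original dataset, and we assume the time at the company is measured in years.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EmployeeRecord:
    # Field names are hypothetical; the ranges follow the variable list.
    satisfaction: float      # 0 to 10
    last_evaluation: float   # 0 to 10
    num_projects: int        # 2 to 7
    avg_monthly_hours: int   # 96 to 310
    years_at_company: int    # 2 to 10 (unit assumed to be years)
    work_accident: int       # 1 = yes, 0 = no
    promoted_last_5y: int    # 1 = yes, 0 = no
    salary_range: int        # 1 = low, 2 = medium, 3 = high
    left_company: int        # 1 = yes, 0 = no


# Inclusive (min, max) bounds per field, taken from the variable list.
BOUNDS = {
    "satisfaction": (0, 10),
    "last_evaluation": (0, 10),
    "num_projects": (2, 7),
    "avg_monthly_hours": (96, 310),
    "years_at_company": (2, 10),
    "work_accident": (0, 1),
    "promoted_last_5y": (0, 1),
    "salary_range": (1, 3),
    "left_company": (0, 1),
}


def validate(record: EmployeeRecord) -> List[str]:
    """Return the names of the fields whose values fall outside the stated ranges."""
    return [
        name
        for name, (low, high) in BOUNDS.items()
        if not low <= getattr(record, name) <= high
    ]
```

A record that passes returns an empty list, so a check of this kind can run before any analysis to catch rows that were not sanitized correctly.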

Number of people per department





Frequency Analysis / Distribution of Satisfaction


The highest concentration of satisfaction is within the range of 7 to 9, and there are few people with satisfaction scores between 1.5 and 3.0.
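A frequency distribution like this one can be reproduced by counting scores per bin. The sketch below is illustrative only: the function name, the bin width of 1.0 and the sample scores are assumptions, and the sample values are made up rather than taken from the dataset.

```python
import math
from collections import Counter


def satisfaction_histogram(scores, bin_width=1.0):
    """Count scores per bin; each key is the lower edge of its bin."""
    counts = Counter(math.floor(s / bin_width) * bin_width for s in scores)
    return dict(sorted(counts.items()))


# Made-up scores, only to show the shape of the output.
sample_scores = [7.2, 8.1, 8.9, 7.7, 2.4, 9.0, 5.5]
for edge, n in satisfaction_histogram(sample_scores).items():
    print(f"{edge:4.1f} and up: {'#' * n}")
```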


Ranking of Influence Factors in Work Satisfaction

Processing this dataset with the VORTX Big Data algorithm produced the following ranking of factors, with each factor's score in parentheses:

  1. Average monthly hours (50)
  2. Time spent at the company (21)
  3. Number of projects (20)
  4. Salary Range (13)
  5. Left the company (10)
  6. Whether they have had a promotion in the last 5 years (9)
  7. Whether they have had an accident at work (9)

The factor “Last evaluation” had no relevant influence and it was automatically discarded by VORTX.
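The post does not describe how VORTX computes these scores, so the sketch below uses a much simpler stand-in to illustrate the general idea of ranking factors and discarding irrelevant ones: the absolute Pearson correlation of each factor with the target. The function names and the `min_abs_corr` threshold are our own assumptions, and this approach only detects linear relationships.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_factors(columns, target, min_abs_corr=0.05):
    """Rank factors by |correlation| with the target, dropping weak ones.

    columns maps a factor name to its list of values; the threshold is
    an arbitrary illustrative cut-off, not a value used by VORTX.
    """
    scored = [(name, abs(pearson(values, target)))
              for name, values in columns.items()]
    kept = [(name, score) for name, score in scored if score >= min_abs_corr]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

Called with the satisfaction level as `target`, a function like this returns the factors in descending order of strength, and any factor below the threshold is left out of the ranking, in the same spirit as “Last evaluation” being discarded.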

Satisfaction Scenarios

The table below shows the result of the processing, in which the platform automatically separated employees into groups. In all, 120 groups were found; here we focus on only the 20 most relevant and set the others aside as isolated cases outside the scope of this analysis.
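To give an idea of what separating employees into groups and keeping the largest ones can look like, here is a deliberately simple sketch. It is not the VORTX method, which the post does not detail: it just bins each variable at hand-picked edges and groups records that fall into the same combination of bins. The function names, the dictionary-based records and the bin edges are all hypothetical.

```python
from collections import defaultdict


def discretize(value, edges):
    """Return the index of the first edge the value is below (len(edges) if none)."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def largest_groups(records, binning, top_n=20):
    """Group records by their tuple of bin indexes and keep the top_n largest groups.

    records is a list of dicts; binning maps a field name to its bin edges.
    """
    groups = defaultdict(list)
    for rec in records:
        key = tuple(discretize(rec[name], edges) for name, edges in binning.items())
        groups[key].append(rec)
    ranked = sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)
    return ranked[:top_n]
```

The remaining small groups play the role of the isolated cases that are left out of the analysis.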


Model Visual Validation

In our experience, managers are typically unsure about a machine’s ability to automate the discovery of insights. Therefore, as proof of the model, we chose to show the raw data visually to demonstrate the insights mentioned above.


The pattern of hours worked by the 588 people in scenario 9 (very dissatisfied). X Axis = Monthly working hours.



The pattern of hours worked in the largest scenario (1), which has 4085 employees, good job satisfaction and a low level of turnover. X Axis = Monthly working hours.

In the view below, each circle represents an employee across four dimensions:

  • The level of satisfaction on the Y axis.
  • Average hours per month on the X axis.
  • Orange colors for people who left the company and blue for those who remain.
  • Circle size represents the number of years in the company.


Alright, we just saw the overall pattern for the whole organization, so what happens if we look at it by department?





Conclusions and Recommendations

This study sheds some light on improving human resource management, which is at the heart of today’s businesses. Applying data analytics algorithms in this area makes it possible to automate and accelerate pattern discovery in complex environments with, let’s say, 50 variables or more. Here there were just a few. Meanwhile, the search for patterns in traditional BI remains purely artisanal work, with a well-known limitation of 4 dimensions per attempt (read more on this at Understanding the differences between BI, Big Data and Data Mining). Automating discovery is an extremely important step in predictive analytics, in this case for anticipating the departure of highly qualified professionals and dissatisfaction that management may have overlooked.

With VORTX’s ability to discover the different scenarios, we were able to analyze the data and conclude that:

  • People in groups 1 and 2 (55% of the company) have reasonable job satisfaction with an average weekly load of 50 hours, without having received a promotion or suffered an accident at work.
  • The pattern persists in all departments.
  • The most satisfied of the 20 largest groups were groups 7 and 10, whose members worked more than 247 hours a month and took on several projects but, as they did not receive a promotion, left the company. People like these should be retained, since they seem to be highly qualified.
  • Group 16 proves that it is possible to earn a good salary and be dissatisfied. These 77 people should be interviewed to identify the root cause of their dissatisfaction.
  • The cut-off line for non-company employees is: minimum 170 and maximum 238 hours worked per month. People with more than 3.5 years at the company work harder and are more satisfied.
  • Monthly hours above 261 resulted in very low levels of satisfaction.
  • Monthly hours below 261 combined with more than 3 projects result in high job satisfaction.
  • Scenario 15 shows the importance of a promotion within the last 5 years of work.
  • Employees with more than 5 projects show lower satisfaction; the ideal number is between 3 and 5. Of course, to better understand this indicator it is necessary to understand what the number of projects represents in different departments.
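Some of the thresholds in the conclusions above can be turned into a simple checklist. The sketch below is only an illustration under our own assumptions: the function names are hypothetical, the conversion from weekly to monthly hours assumes 52 weeks spread over 12 months, and the rules encode only the 261-hour limit and the ideal range of 3 to 5 projects.

```python
def weekly_to_monthly(weekly_hours):
    """Convert weekly hours to average monthly hours (assumes 52 weeks / 12 months)."""
    return weekly_hours * 52 / 12


def workload_flags(avg_monthly_hours, num_projects):
    """Return warnings based on the hour and project thresholds from the conclusions."""
    flags = []
    if avg_monthly_hours > 261:
        flags.append("monthly hours above 261")
    if num_projects > 5:
        flags.append("more than 5 projects")
    elif num_projects < 3:
        flags.append("fewer than 3 projects")
    return flags
```

Under that conversion, the 50-hour weekly load of groups 1 and 2 corresponds to roughly 217 hours a month, which is below the 261-hour threshold.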

For managers, collecting as many indicators as possible, without interruption and across all areas, is always good. Additional variables that could enrich your model include:

  • The distance between the employee’s home and work.
  • The average commute time from home to work.
  • The number of children.
  • The number of phone calls or emails sent and received.
  • Gender, age and the reason for leaving the job.

We hope this information is useful for you guys in some way. If you find it relevant, share it with your colleagues. If in doubt, contact us! A big hug and success in developing your own HR strategy!


Big Data Scenario Discovery, why is it super useful for decision making?

Hi everyone, in today’s demonstration we are going to show how Big Data Scenario Discovery can help decision making in a profound way in various sectors. We use AQUARELA VORTX Big Data, a groundbreaking tool in the machine learning field. The dataset used for the experiment was presented in the previous post about Big Data country auto-segmentation (clustering). The difference here is that this one also includes the Gini Index (shown later on) and removes the electrification rate in rural areas. It also seeks systemic influences towards a GOAL, in this case the Human Development Index (HDI), whereas the previous segmentation just grouped similar countries according to their general characteristics.

The key questions for the experiment:

  1. How many Human Development Index scenarios exist in total? And which countries belong to them?
  2. Amongst 65 indexes, which of them have the most influence in defining a High or Low Human Development Index?
  3. What is the DNA (set of characteristics) of a High and Low Human Development scenario?

Alright, hang on for a minute! Before you see the results, take a look at all the variables analysed in the previous post. Then try to work out for yourself, making the most of your intuition, what the answers to these 3 questions would be. This is a very fun and very useful cognitive exercise for scenario validation. OK?

Results after pushing the Discoverer button:

HDI - Total

This is the overall distribution of 188 countries, where most countries have an HDI between 0.65 and 0.75 and very few are above 0.90. In total there are 15 different HDI scenarios, of which the first 3 account for more than 94% of the total; those are the ones we will focus on.
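Deciding how many scenarios to focus on comes down to a cumulative share calculation. The sketch below assumes a hypothetical helper name and a hypothetical list of scenario sizes; the sizes in the usage line are made up and are not the ones found in this study.

```python
def scenarios_needed(sizes, coverage=0.94):
    """Return how many of the largest scenarios are needed to cover the given share."""
    total = sum(sizes)
    running = 0
    for count, size in enumerate(sorted(sizes, reverse=True), start=1):
        running += size
        if running / total >= coverage:
            return count
    return len(sizes)


# Made-up scenario sizes, only to illustrate the calculation.
print(scenarios_needed([50, 30, 15, 3, 2], coverage=0.94))
```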

Scenario 1

The most common scenario and the average HDI

Scenario 2

Countries with the lowest HDI

Scenario 3

Countries with the highest HDI

Where are they located?

What factors influence HDI the most and the least?


The list marks the top and bottom 10 factors. The factor “Intimate or Nonintimate Partner Violence ever experienced 2001-2011” was automatically removed from the ranking, as it does not correlate with HDI.

What is the DNA of each main scenario?

All factors presented at once. Note that the scales on the X axis change dynamically when hovering the mouse over the VORTX data scope screen.

Drilling down into the DNA

Under-Five Mortality rates vs HDI

Filtering the visualisation by the most relevant factor and HDI (HDI is the focus of the analysis, so it has the darker colour). Here we see that countries with the highest HDI have the lowest under-five mortality rates.

Gender Inequality Rate vs HDI

Gross National Income (GNI) per capita vs HDI

Insights and Conclusions of the study

The possibilities for generating new knowledge from this Big Data strategy are endless, but we focused on just a few questions and a few screenshots to demonstrate its value. During this research, we found it interesting to see the machine autonomously confirming some previous intuitions while breaking some preconceptions. It is important to mention that we are not measuring causation, as if one factor led to another or vice versa; the results show systemic correlations only. Here are some that caught our attention:

  • Gender inequality playing a strong role, with an inverse correlation to the Human Development Index, while we are living through a transition from the industrial age to the information age, in which knowledge is surpassing the physical differences between genders.
  • Research and development having a direct correlation with HDI.
  • The United States having its own scenario due to its unique systemic characteristics.
  • Gross National Income (GNI) per capita leading the ranking, with values around 40 thousand dollars.
  • Public expenditure ranking ahead of education-related indexes.

Business applications

Applying the same questions we had at the beginning of the article, let’s now see what they would look like for different business scenarios:


  • How many scenarios exist for your sales? Which customer segments belong to each scenario?
  • Amongst several business factors, which of them have the most influence in defining High or Low revenue?
  • What is the DNA (characteristics) of a High and Low revenue scenario?


  • How many production/maintenance scenarios exist for your production line? Which processes belong to each scenario?
  • Amongst several production factors, which of them have the most influence in defining a High or Low outcome, or High or Low maintenance costs?
  • What is the DNA (characteristics) of a High and Low production/maintenance scenario?


  • How many patient scenarios exist for a specific disease or medical condition? Which patients belong to each scenario?
  • Amongst several patient characteristics, which of them have the most influence in producing High or Low levels of a specific disease or medical condition?
  • What is the DNA (characteristics) of a High and Low medical condition scenario?

All in all, we hope this article helps you land smoothly on the newest territories of machine learning. In case you need more information on how this solution applies to your business scenario, please let us know. If you found this analysis interesting and worth spreading, please share it. Super thanks on behalf of Aquarela’s team!
